Teradata Interview Questions and Answers
A list of top frequently asked Teradata Interview Questions and Answers are given below.
1) What is Teradata? What are some primary characteristics of Teradata?
Teradata is an RDBMS (Relational database management system) which is perfect to use with large-scale data warehousing application. It works on the parallelism concept. It is an open system. It can run on Windows/ UNIX/ Linux server platform. Teradata provides support to multiple data warehouse operations at the same time to different clients.
It is developed by an American IT firm called Teradata corporation. It is a dealer of analytic data platforms, applications, and other related services.
Characteristics of Teradata
- It is compatible with the American National Standards Institute (ANSI).
- It acts in a way as a server does.
- It is an open system.
- It is single.
- It has multi-node running capabilities.
- It is built on parallelism.
2) What are the different table types supported by Teradata?
There are four types of tables as per data storage in Teradata:
- Permanent table
- Global Temporary Table (GTT)
- Volatile table
- Derived table
Permanent Table
These are the Default table types in Teradata. Some of its characteristics are as follows.
- As its name suggests these tables remains in the system until it is dropped.
- Data is stored in a stable space.
- The permanent table definition is stored in the data dictionary.
Global Temporary Tables
Global Temporary tables are also another kind of permanent tables. These tables are used to store the globally used values throughout the application, and the lifetime is limited to the user session. Once the user session is over, the table will be dropped.
- The global temporary table definition is stored in the data dictionary
- Data is stored in temporary space
- Collect statistics supported.
Volatile Tables
Volatile tables are used to store the user session data only. At the end of a particular user session, the table will drop. Volatile tables are essential to store in-between data during data transmission or in complex calculations.
Derived Tables
Derived tables have the smallest lifetime among all the tables. These tables hold the intermediate results during the query execution. These tables are created, used and dropped within a query.
3) What is the difference between Teradata and Oracle?
The Teradata and Oracle both are the Relational database management systems. However, Oracle supports an Object-Relational Database Management System (ORDBMS).
Let’s check out some differences between Teradata and Oracle based on the following parameters.
Architecture
Oracle is Shared Everything Architecture whereas Teradata is Shared Nothing (SN) Architecture.
Here the term Shared architecture is referred to a multiprocessor database management system where memory and disk storage is shared between the processors.
Parallelism
Oracle has conditional parallelism whereas Teradata has unconditional parallelism. It gives Teradata advantage over OLAP, which results in the exceptional performance than a non-parallel system. Parallelism needs a multi-processor system.
Scalability
Scalability contains several aspects of an IT infrastructure such as data handling ( Increases in Data and transactional volume) as well as the increase in multidimensional data, number of users, query complexity, etc.
Teradata is Linearly Scalable. Linearly scalable means the database capacity can be increased by adding more nodes to the infrastructure, and when the data volume increases, performance is not affected.
4) What are the Updated features of Teradata?
Some of its newly developed features are as follows.
- Automated temporal analytics.
- JavaScript object Notation
- Teradata QueryGrid
- XML Data Type
- Performance
- Data Compression
- Customer associated innovation like Teradata viewpoint.
5) What is the Multi-insert?
Inserting data records into the table using more than one insert statements are referred to as Multi-insert. We can achieve it by putting a semicolon in front of the keyword INSERT in the next statement rather than terminating the first statement with a semicolon.
Insert into Cname “select * from customer”;
Insert into amount “select * from customer”;
6) What is BTEQ utility in Teradata?
BTEQ utility is the most powerful utility in Teradata. It is useful for both batch and interactive mode. It can also be used to run any DDL statement, DML statement, Create macros, and stored procedures. One another important use of BTEQ Is to import data into Teradata tables from a flat-file. It is also useful for extracting data from tables into files or reports.
7) What are some commonly used BTEQ scripts?
Some commonly used BTEQ scripts are as follows.
- LOGON: It is used to log into the Teradata system.
- ACTIVITYCOUNT: It returns the number of rows affected by the most recently used query.
- ERRORCODE: It returns the status code of the most recently used query.
- DATABASE: It sets the default database.
- LABEL: It assigns a label to a set of SQL commands.
- RUN FILE: It executes the query contained in a file.
- GOTO: It transfers control to a label.
- LOGOFF: It logs off from the database and terminates all sessions.
- IMPORT: it specifies the input file path.
- EXPORT: It specifies the output file path and initiates the export.
8) What is the difference between fastload and multiload? Which one is faster?
Fastload uses multiple sessions to rapidly load a large amount of data on an empty table, while Multiload is used for high-volume maintenance on tables and views. Multiload works with non-empty tables also. Multiload can use a maximum of five tables.
If we talk about the faster one, then Fastload is faster than multi-load.
9) What is the difference between Teradata and basic RDBMS?
Teradata | Basic RDBMS |
---|---|
It has a large number of different destinations | Basic RDBMS has a lack of various destinations. |
Source operation is allowed in Teradata. | It is not necessary that source operation is always allowed in basic RDBMS. |
Components can be reused for any number of times. | Reusability of components is limited. |
Debugging is easy in Teradata. | Debugging is complicated. |
10) Explain AMP in Teradata?
AMP is an integral part of Teradata Architecture. The term AMP stands for Access module Processor. It stores the data on the disks. AMP is a part of the following activities.
- It manages a portion of the database
- It maintains a part of each table.
- It accomplishes all the tasks associated with generating result set such as sort, join, and aggregation.
- It performs space and lock management.
11) What is SMP and MPP platforms?
SMP technology is related to hardware. The hardware that supports Teradata database software is based on SMP (Symmetric multiprocessing) technology. The hardware can be combined with a communications network that connects the SMP systems to form MSP (Massively Parallel Processing) systems.
12) Explain some differences between MPP and SMP?
MPP
- MPP (Massively Parallel Processing) is a Computer system which is attached to many independent arithmetic units or entire microprocessors that run in parallel.
- Databases can be expanded by adding additional CPUs.
- An MPP environment does not share resources among physical computers, so the performance in MPP environment is improved.
- Performance of an MPP system is linear, so it increases in proportion to the number of nodes.
SMP
- In an SMP (Symmetric Multi-Processing) processing system, the CPU shares the same memory. So the result code running in one system may affect the memory used by another.
- SMP databases usually use one CPU to perform database searches.
- The workload for a parallel task is allocated across the processors in the system.
- SMP databases can run on several servers. However, they will share another resource.
13) Did You Write Stored Procedures In Teradata?
No, because the stored procedures become a particular AMP operation and no company will encourage that.
14) What Is the Use of having Index on Table?
Index table facilitates with the faster and efficient search of the record.
15) How to find duplicates in a Table?
To find the duplicates in a table, Group by those fields and select id, count(*) from table group by id having count (*) > 1.
16) Why managing the data is important?
Data is the ultimate source of deriving useful information. With data, many important tasks such as business management, problem formulation, decision making, and many other valuable tasks can be accomplished easily. When the data is not managed, then there are substantial chances that the user will get the errors. A well-managed data always allows users to save time, and to analyze things easily. There are a lot of other reasons as well due to which data management is important.
17) What exactly do you know about Catching in Teradata?
It is an add-on feature in Teradata which let the users to share the cache easily with all the applications because it works closely with the source and even let the users mound the outcomes in the manner they are comfortable with. This approach saves time when the data is complex and contain so many errors associated with them.
18) How you will check the version of Teradata?
It can be checked with the following command
“.SHOW VERSION”.
19) Explain the Parallel Data Extension in Teradata?
PDE is a software interface layer that lies between the Teradata Database and operating system. PDE supports the parallelism through system nodes. It contributes to Teradata Database speed and linear scalability. Many utilities like diagnostic and troubleshooting work at the PDE level.
PDE tools are a collection of PDE utilities that come with Teradata Database. They are not listed in Utilities because PDE tools have online documentation accessible from a system console using the “pdehelp” and “man” commands.
20) What is the use of FALLBACK?
FALLBACK is a unique feature used by Teradata to handle AMP failures. It protects data in case of AMP vproc failure. Fallback is very useful for the application that requires high availability.
Fall back is automatic; it is enabled by default when you deploy a Teradata database. The fallback setting can’t be overridden during or after table creation. Fallback is transparent; it protects data by storing a second copy of each row of a table on any other AMP in the same cluster. Fallback facilitates with AMP fault tolerance at the table level.
21) What is Database exceptions in Teradata?
Teradata Database deals with the same features that come with an on-premises Teradata Database system with the following exceptions:
- Teradata database data block read-ahead count is only 15 data blocks.
- Teradata database has a default PERM DB size for permanent tables is 254 sectors.
- Teradata database has a default WORK DB size for temporary tables is 254 sectors, sometimes referred to as SPOOL DB size.
- In Teradata, one single transaction can consume 100% of FSG cache.
Teradata Database 16.10 does not support Multiple Hash Maps feature in the public cloud.
22) List out Teradata data types?
The list of some basic datatypes in Teradata is as follows.
Data Types | Length (Bytes) | Range of values |
---|---|---|
BYTEINT | 1 | -128 to +127 |
SMALLINT | 2 | -32768 to +32767 |
INTEGER | 4 | -2,147,483,648 to +2147,483,647 |
BIGINT | 8 | -9,233,372,036,854,775,80 8 to +9,233,372,036,854,775,8 07 |
DECIMAL | 1-16 | |
NUMERIC | 1-16 | |
FLOAT | 8 | IEEE format |
CHAR | Fixed Format | 1-64,000 |
VARCHAR | Variable | 1-64,000 |
DATE | 4 | YYYYYMMDD |
TIME | 6 or 8 | HHMMSS.nnnnnn+HHMM or HHMMSS.nnnnnn |
CHAR | Fixed Format | 1-64,000 |
TIMESTAMP | 10 or 12 | YYMMDDHHMMSS.nnnnnn +HHMM or YYMMDDHHMMSS.nnnnnn |
23) Describe Primary index in Teradata. And what are the available primary index types?
The technique to specify where the data exist in the Teradata is called primary index. Each table should contain a primary index specified, if not, Teradata will assign a primary index for the table. The main index provides faster data access and search.
There are two types of primary indexes in Teradata:
- Unique Primary Index(UPI)
- Non-Unique Primary Index(NUPI)
24) Why is the CASE Expression used in Teradata?
CASE Expression is used to evaluate each case against a specific condition and returns the result based on the first match. When there is no case that will match condition, then else part will return.
The basic syntax of a CASE expression is as follows:
25) What are the Joins in Teradata and How many types of Joins are there in Teradata?
Joins combine the record from more than one table using common columns or value.
There are seven types of joins associated with Teradata.
Inner Join
Inner joins combine the records from multiple tables and returns the value set that is common in both tables.
Left Outer Join
Left outer join returns all the records in the left table and only common records from the right table.
Right Outer Join
Right outer join returns all the records in the right table and only common records from the left table.
Full Outer Join
It is a combination of Left Outer Join and Right Outer Join. It returns both common and distinct records from both the tables.
Self-Join
Self-join compares the value in a column with the other values in the same column of the table.
Cross Join
Cross join joins every row from the left table to every row in the right table.
Cartesian Production Join
It works the same as cross join.
26) What is called Partitioned Primary Index (PPI) and discuss the advantages of using it in a query?
Partitioned Primary Index (PPI) is an indexing technique that allows for improving the performance of specific queries. Partitioned Primary Index (PPI) is defined within a table, and rows are sorted according to their partition number. Their row hash arranges records.
Advantages of Partitioned Primary Index (PPI):
- PPI helps to avoid a full table scan and only required partitions are accessed.
- PPI avoids using the secondary index, and it helps to prevent additional I/O maintenance.
- PPI allows Quick access to the subset of a large table.
- PPI facilitates with easy to drop old data and add new data.
27) Define Views in Teradata with syntax.
Database objects that are built using queries on tables are termed as views. The definition of view is stored permanently in the data definition. Data for the view is a dynamic process at the execution time.
Syntax
28) Describe the set operators in Teradata.
Set operators are used to batch the result from multiple SELECT statements. Set operator is different from joins because joins batch the columns in multiple tables, but set operators batch multiple rows.
Given below are the four Set operators in Teradata:
- UNION
- UNION ALL
- INTERSECT
- MINUS/EXCEPT
29) What is Upsert statement in Teradata?
In Teradata, we can combine the update and insert statement into a single statement. It is called an Upsert statement.
30) What are the String Manipulation operators and functions associated with Teradata?
Teradata String functions are used for string manipulation. It concatenates strings and creates a single string. It also supports some standard string functions along with the Teradata extension to those functions.
- SUBSTR: It is used to extract only a portion of the long string depends on Teradata extension.
- SUBSTRING: It is used to extract only a portion of the long string depends on the ANSI extension.
- INDEX: It is used to locate a specific position of a character string depends on the Teradata extension.
- POSITION: It is used to locate a specific position of a character string depends on the ANSI extension.
- TRIM: It trims blank from the specified string.
- UPPER: It converts the string to uppercase.
- LOWER: It converts the string to lowercase.