Informatica Interview Questions And Answers
1. What can we do to improve the performance of Informatica Aggregator Transformation?
Answer: Aggregator performance improves dramatically if records are sorted before passing to the aggregator and “sorted input” option under aggregator properties is checked. The recordset should be sorted on those columns that are used in Group By operation.
It is often a good idea to sort the record set in database level e.g. inside a source qualifier transformation unless there is a chance that already sorted records from source qualifier can again become unsorted before reaching aggregator
You may also read this article to know how to tune the performance of aggregator transformation.
2. What is Informatica ETL Tool?
Informatica ETL tool is a market leader in data integration and data quality services. Informatica is a successful ETL and EAI tool with significant industry coverage.ETL refers to extract, transform, load. Data integration tools are different from other software platforms and languages.
They have no inbuilt feature to build user interface where an end-user can see the transformed data. Informatica ETL tool “power center” can manage, integrate and migrate enterprise data.
3. Can we use the mapping parameter or variables developed in one mapping into any other reusable transformation?
Yes, we can use the mapping parameter or variables into any other reusable transformation because it doesn’t have any mapplet.
4. What do you understand by term role-playing dimension?
The dimensions that are used for playing diversified roles while remaining in the same database domain are known as role-playing dimensions.
5. What are the features of complex mapping?
The features of complex mapping are:
Many numbers of transformations
tricky needs compound business logic
6. What is the different lookup cache(s)?
Informatica Lookups can be cached or un-cached (No cache). And Cached lookup can be either static or dynamic. A static cache is one which does not modify the cache once it is built and it remains the same during the session run. On the other hand, A caches refreshed during the session run by inserting or updating the records in the cache based on the incoming source data.
By default, Informatica cache is static cache. A lookup cache can also be divided as persistent or non-persistent based on whether Informatica retains the cache even after the completion of session run or deletes it.
7. Write the advantages of partitioning a session?
Answer: The main advantage of partitioning a session is to get a better server’s process and competence. Another advantage is it implements the solo sequences within the session.
8. Differentiate between mapping parameter and mapping variable?
Answer: At the time values alter during the session’s implementation it is known as mapping variable whereas the values that don’t alter within the session implementation is called as mapping parameters.
9. How can we update a record in the target table without using Update strategy?
Answer: A target table can be updated without using ‘Update Strategy’. For this, we need to define the key in the target table in Informatica level and then we need to connect the key and the field we want to update in the mapping Target. In the session level, we should set the target property as “Update as Update” and check the “Update” check-box. Let’s assume we have a target table “Customer” with fields as “Customer ID”, “Customer Name” and “Customer Address”.
Suppose we want to update “Customer Address” without an Update Strategy. Then we have to define “Customer ID” as a primary key in Informatica level and we will have to connect Customer ID and Customer Address fields in the mapping. If the session properties are set correctly as described above, then the mapping will only update the customer address field for all matching customer IDs.
10. What are the new features of Informatica 9.x Developer?
Answer: From an Informatica developer’s perspective, some of the new features in Informatica 9.x are as follows: Now Lookup can be configured as an active transformation – it can return multiple rows on a successful match
Now you can write SQL override on un-cached lookup also. Previously you could do it only on cached lookup
You can control the size of your session log. In a real-time environment, you can control the session log file size or time
Database deadlock resilience feature – this will ensure that your session does not immediately fail if it encounters any database deadlock, it will now retry the operation again. You can configure several retry attempts.
11. How to elaborate Tracing Level?
Answer: Tracing Level can be defined as the amount of information that the server writes in the log file. Tracing Level is created and configured either at the transformation level or at the session-level else at both the levels.
Given below are the 4 types of tracing level:
12. Differentiate between joiner and Lookup Transformation?
Below are the differences between lookup and joiner transformation:
In lookup, we can override the query but in joiner, we cannot.
In lookup we can provide different types of operators like – but, in joiner only “= “ (equal to )operator is available.
In lookup we can restrict the number of rows while reading the relational table using lookup override but, in joiner, we cannot restrict the number of rows while reading.
In joiner we can join the tables based on- Normal Join, Master Outer, Detail Outer, and Full Outer Join but, in lookup, this facility is not available. Lookup behaves like Left Outer Join of the database.
13. What is meant by Lookup Transformation? Explain the types of Lookup transformation?
Lookup transformation in mapping is used to look up data in a flat-file, relational table, view, or synonym. We can also create a lookup definition from a source qualifier.
We have the following types of Lookup.
Relational or flat-file lookup. To perform a lookup on a flat file or a relational table.
Pipeline lookup. To perform a lookup on application sources such as JMS or MSMQ.
Connected or unconnected lookup.
A connected Lookup transformation receives source data, performs a lookup, and returns data to the pipeline.
An unconnected Lookup transformation is not connected to a source or target. A transformation in the pipeline calls the Lookup transformation with a: LKP expression. The unconnected Lookup transformation returns one column to the calling transformation.
Cached or un-cached lookup. We can configure the lookup transformation to Cache the lookup data or directly query the lookup source every time the lookup is invoked. If the Lookup source is Flat file, the lookup is always cached.
14. How can you increase the performance in joiner transformation?
Below are how you can improve the performance of Joiner Transformation.
Perform joins in a database when possible.
In some cases, this is not possible, such as joining tables from two different databases or flat file systems. To perform a join in a database, we can use the following options:
Create and Use a pre-session stored procedure to join the tables in a database.
Use the Source Qualifier transformation to perform the join.
Join sorted data when possible
For an unsorted Joiner transformation, designate the source with fewer rows as the master source.
For a sorted Joiner transformation, designate the source with fewer duplicate key values as the master source.
15. What is the Lookup transformation?
Answer: Lookup transformation is used to look up a source, source qualifier, or target in order to get relevant data. You can look up a ‘flat-file’, ‘relational table’, ‘view’ or ‘synonym’. Lookup can be configured as Active or Passive as well as Connected or Unconnected transformation. When the mapping contains the lookup transformation, the integration service queries the lookup data and compares it with lookup input port values. One can use multiple lookup transformations in a mapping.
The lookup transformation is created with the following type of ports:
Input port (I)
Output port (O)
Look up Ports (L)
16. What are the differences between a connected lookup and unconnected lookup?
Answer: The connected lookup takes the inputs directly from other transformations in the pipeline. Unconnected lookup does not take inputs directly from other transformations. It can be used in any transformation and can be raised as a function using the LKP expression. An unconnected lookup can be called multiple times in your mapping data flow.
17. How many input parameters can exist in an unconnected lookup?
Answer: Any number of input parameters can exist. For instance, you can provide input parameters like column 1, column 2, column 3, and so on. But the return value would only be one.
18. What is the difference between active and passive transformation?
Active Transformation:- An active transformation can perform any of the following actions:
Change the number of rows that pass through the transformation: For instance, the Filter transformation is active because it removes rows that do not meet the filter condition.
Change the transaction boundary: For e.g., the Transaction Control transformation is active because it defines a commit or rolls back transaction based on an expression evaluated for each row.
Change the row type: For e.g., the Update Strategy transformation is active because it flags rows for insert, delete, update, or reject.
Passive Transformation: A passive transformation is one which will satisfy all these conditions:
Does not change the number of rows that pass through the transformation
Maintains the transaction boundary
Maintains the row type
19. What is a surrogate key?
It is used in substitution for the natural primary key
20. What is the difference between static cache and dynamic cache?
Answer: In the static cache, the data will remain the same for the entire session, whereas in the dynamic cache, whenever a row is inserted the cache will also be updated.
21. What is the command used to run a batch?
The pmcmd command.
These are some of the popular questions asked during an Informatica interview. Always be prepared to answer all types of questions — technical skills, interpersonal, leadership or methodology. If you have recently started your career in a data profile, you can always get certified to understand the industry-related terminology, skills, and methodologies.
22. What are the advantages of Informatica?
Informatica has some advantages over other data integration systems. A couple of advantages are:
It is faster than the available platforms.
You can easily monitor your jobs with Informatica Workflow Monitor.
It has made data validation, iteration and project development to be easier than before.
If you experience failed jobs, it is easy to identify the failure and recover from it. The same applies to jobs that are running slowly.
23. State the differences between SQL Override and Lookup Override?
The role of SQL Override is to limit the number of incoming rows entering the mapping pipeline, whereas Lookup Override is used to limit the number of lookup rows to avoid the whole table scan by saving the lookup time and the cache it uses.
Lookup Override uses the “Order By” clause by default. SQL Override doesn’t use it and should be manually entered in the query if we require it
SQL Override can provide any kind of ‘join’ by writing the query
Lookup Override provides only Non-Equi joins.
Lookup Override gives only one record even if it finds multiple records for a single condition
SQL Override doesn’t do that.
If you want to get hands-on learning on Informatica, you can also check out the tutorial given below. In this tutorial, you will learn about Informatica Architecture, Domain & Nodes in Informatica, and other related concepts.
24. What is parallel processing in Informatica?
After optimizing the session to its fullest, we can further improve performance by exploiting underutilized hardware power. This refers to parallel processing and we can achieve this in Informatica Powercenter using Partitioning Sessions.
The Informatica Powercenter Partitioning Option increases the performance of the Powercenter through parallel data processing. The Partitioning option will let you split the large data set into smaller subsets which can be processed in parallel to get a better session performance.
25. Explain the shared cache and re-cache?
To answer this question, it is essential to understand persistence cache. If we are performing the lookup on a table, it looks up all the data brings it inside the data cache. However, at the end of each session, the Informatica server deletes all the cache files. If you configure the lookup as a persistent cache, the server saves the lookup under an anonymous name. Shared cache allows you to use this cache in other mappings by directing it to an existing cache.
After a while, data in a table becomes old or redundant. In a scenario where new data enters the table, re-cache ensures that the data is refreshed and updated in the existing and new cache.
I hope this Informatica Interview questions blog was of some help to you. We also have another Informatica Interview questions wherein scenario-based questions have been compiled. It tests your hands-on knowledge of working on Informatica tool. You can go through that Scenario-based Informatica Interview Questions blog by clicking on the hyperlink or by clicking on the button at the right-hand corner.
26. What is an Expression Transformation in Informatica?
Answer: An expression transformation in Informatica is a common Powercenter mapping transformation. It is used to transform data passed through it one record at a time. The expression transformation is passive and connected. Within an expression, data can be manipulated, variables created, and output ports generated. We can write conditional statements within output ports or variables to help transform data according to our business requirements.
27. How to Delete duplicate row using Informatica?
Scenario 1: Duplicate rows are present in the relational database
Suppose we have Duplicate records in Source System and we want to load only the unique records in the Target System eliminating the duplicate rows.
28. What is XML Source Qualifier Transformation in Informatica?
1. Reads the data from XMl files.
2. XML source definition associates with XML source Qualifier.
3. XML files are case sensitive markup language.
4. Files are saved with an extension.XML.
5. XML files are hierarchical (or) parent-child relationship file formats.
6. Files can be normalized or denormalized.
29. What is Informatica Power Center?
– A data integration tool which combines the data from multiple OLTP source systems transforms the data into a homogeneous format and delivers the data throughout the enterprise at any speed.
– It is a GUI based ETL product from Informatica corporation which was founded in 1993 Redwood City, California.
– There are many products in Informatica corporation:
1. Informatica Analyzer.
2. Life cycle management.
3. Master data
Having many products in Informatica.
Informatica power center is one of the products of information.
Using Informatica power center we will do the Extraction, transformation, and loading.
30. What is meant by active and passive transformation?
An active transformation is the one that performs any of the following actions:
Change the number of rows between transformation input and output. Example: Filter transformation
Change the transaction boundary by defining commit or rollback points., example transaction control transformation
Change the row type, example Update strategy is active because it flags the rows for insert, delete, update or reject
On the other hand, a passive transformation is the one which does not change the number of rows that pass through it. Example: Expression transformation.
31. What are the new features of Informatica 9.x in developer level?
Answer: We can configure a Lookup transformation to cache the underlying lookup table. In case of static or read-only lookup cache, the Integration Service caches the lookup table at the beginning of the session and does not update the lookup cache while it processes the Lookup transformation.
In case of dynamic lookup cache, the Integration Service dynamically inserts or updates data in the lookup cache and passes the data to the target. The dynamic cache is synchronized with the target.
In case you are wondering why do we need to make lookup cache dynamic, read this article on dynamic lookup
32. What is the expected value if the column in an aggregator transform is neither a group by nor an aggregate expression?
Answer: Integration Service produces one row for each group based on the group by ports. The columns which are neither part of the key nor aggregate expression will return the corresponding value of the last record of the group received. However, if we specify particularly the FIRST function, the Integration Service then returns the value of the specified first row of the group. So the default is the LAST function.
33. How does a Sorter Cache works?
Answer: The Integration Service passes all incoming data into the Sorter Cache before Sorter transformation performs the sort operation.
The Integration Service uses the Sorter Cache Size property to determine the maximum amount of memory it can allocate to perform the sort operation. If it cannot allocate enough memory, the Integration Service fails the session. For best performance, configure Sorter cache size with a value less than or equal to the amount of available physical RAM on the Integration Service machine.
If the amount of incoming data is greater than the amount of Sorter cache size, the Integration Service temporarily stores data in the Sorter transformation work directory. The Integration Service requires disk space of at least twice the amount of incoming data when storing data in the working directory.
34. What is Union Transformation?
Answer: The Union transformation is an Active, Connected non-blocking multiple input group transformation uses to merge data from multiple pipelines or sources into one pipeline branch. Similar to the UNION ALL SQL statement, the Union transformation does not remove duplicate rows.
35. What is the difference between Router and Filter?
Following differences can be noted:
Router transformation divides the incoming records into multiple groups based on some condition. Such groups can be mutually inclusive (Different groups may contain the same record) Filter transformation restricts or blocks the incoming recordset based on one given condition.
Router transformation itself does not block any record. If a certain record does not match any of the routing conditions, the record is routed to default group Filter transformation does not have a default group. If one record does not match filter condition, the record is blocked
The router acts like CASE. WHEN statement in SQL (Or Switch().. Case statement in C) Filter acts like WHERE condition is SQL.
36. What will be the approach?
Answer: Assuming that the source system is a Relational Database, to eliminate duplicate records, we can check the Distinct option of the Source Qualifier of the source table and load the target accordingly.
37. But what if the source is a flat-file? Then how can we remove the duplicates from flat file source?
Scenario 2: Deleting duplicate rows / selecting distinct rows for FLAT FILE sources
Here since the source system is a Flat File you will not be able to select the distinct option in the source qualifier as it will be disabled due to flat-file source table. Hence the next approach maybe we use a Sorter Transformation and check the Distinct option. When we select the distinct option all the columns will the selected as keys, in ascending order by default.
Deleting Duplicate Record Using Informatica Aggregator
Other ways to handle duplicate records in source batch run is to use an Aggregator Transformation and using the Group By checkbox on the ports having duplicate occurring data. Here you can have the flexibility to select the last or the first of the duplicate column value records.
38. Differences between connected and unconnected lookup?
Connected Lookup Unconnected Lookup
1. Part of the mapping dataflow 1. Separate from the mapping data flow.
2. Returns multiple values (by linking output ports to another transformation) 2. Return one value by checking the Return R port option for the outport that provides the return value.
3. Executed for every record passing through the transformation. 3. Only executed when the lookup function is called.
4. More visible shows where the lookup values are used. 4. Less visible, as the lookup is called from an expression within another transformation
5. Default values are used. 5. Default values are ignored.
39. How to use PMCMD Utility Command?
1. It is a command based client program that communicates with integration service to perform some of the tasks which can also be performed using a workflow manager client.
2. Using PMCMD we can perform the following tasks:
i. Starting workflow.
ii. Scheduling workflow.
3. The PMCMD can be operated in two different modes:
i. Interactive Mode.
ii. Command-line Mode.
40. What is Workflow Manager?
It is a GUI based client which allows you to create following ETL objects.
– A session is a task that executes mapping.
– A session is created for each Mapping.
– A session is created to provide runtime properties.
A session is a set of instructions that tells ETL server to move the data from source to destination.
Workflow is a set of instructions that tells how to run the session takes and when to run the session tasks.
41. Mention few Power Centre client applications with their basic purpose?
Tasks like session and workflow creation, monitoring workflow progress, designing applets, etc are performed by Power center client applications.
Enlisted below is the list of Power center client applications with their purpose:
Repository Manager: It is an administrative tool and its basic purpose is to manage repository folders, objects, groups, etc.
Administration Console: Here the service tasks like start/stop, backup/restore, upgrade/delete, etc are performed.
Power center designer: The designer consists of various designing tools which serve various purposes. These designing tools are:
Workflow Manager: Its basic purpose is to define a set of instructions/workflow that is required to execute mappings designed in the designer. To help develop a workflow, there are 3 tools available, namely Task developer, Workflow designer, Worklet Designer.
Workflow Monitor: As the name suggests, the Workflow monitor, monitors the workflow or tasks. The list of windows available are:
Gantt chart view
42. Why do we need Informatica?
Answer: Informatica comes to the picture wherever we have a data system available and at the backend, we want to perform certain operations on the data. It can be like cleaning up of data, modifying the data, etc. based on a certain set of rules or simply loading of bulk data from one system to another.
Informatica offers a rich set of features like operations at row level on data, integration of data from multiple structured, semi-structured or unstructured systems, scheduling of data operation. It also has the feature of metadata, so the information about the process and data operations are also preserved.
43. How we can confirm all mappings in the repository simultaneously?
At a time we can validate only one mapping. Hence mapping cannot be validated simultaneously.
44. What is Expression transformation?
It is used for performing non aggregated calculations. We can test conditional statements before output results move to the target tables.
45. What are the Different Components of PowerCenter?
Given below are the 7 important components of PowerCenter:
PowerCenter Administration Console
Web Service Hub
46. What happens to map if we alter the datatypes between Source and its corresponding Source Qualifier?
Answer: The Source Qualifier transformation displays the transformation datatypes. The transformation datatypes determine how the source database binds data when the Integration Service reads it.
Now if we alter the datatypes in the Source Qualifier transformation or the data types in the source definition and Source Qualifier transformation do not match, the Designer marks the mapping as invalid when we save it.