1. Explain the Percentage and row sampling Transformations?
Answer: This transformation will take data from the source and randomly sampling data. It gives you 2 outputs. First is selected data and the second one is unselected data. It is used in a situation where you train a data mining model. These two are used to take the SAMPLE of data from the input data.
2. Explain Sort Transformation?
Answer: This component will sort data, similar in TSQL command ORDER BY. Some transformations need sorted data.
3. Explain Union all Transformation?
Answer: It works in the opposite way to merge transformation. It can take the output from more than 2 input paths and combines into a single output path.
4. What are SSIS Connection Managers?
Answer: When we talk of integrating data, we are actually pulling data from different sources and writing it to a destination. But how do you get connected to the source and destination systems? This is where the connection managers come into the picture. Connection manager represents a connection to a system which includes data provider information, the server name, database name, authentication mechanism, etc. For more information check out the SQL Server Integration Services (SSIS) Connection Managers and Connection Managers in SQL Server 2005 Integration Services SSIS tips.
5. What is the Retain Same Connection property and what is its impact?
Answer: Whenever a task uses a connection manager to connect to the source or destination database, a connection is opened and closed with the execution of that task. Sometimes you might need to open a connection, execute multiple tasks and close it at the end of the execution. This is where Retain Same Connection property of the connection manager might help you. When you set this property to TRUE, the connection will be opened on the first time it is used and remain open until execution of the package completes.
6. What are a source and destination adapters?
Answer: A source adaptor basically indicates a source in Data Flow to pull data from. The source adapter uses a connection manager to connect to a source and along with it, you can also specify the query method and query to pull data from the source.
Similar to a source adaptor, the destination adapter indicates a destination in the Data Flow to write data to. Again like the source adapter, the destination adapter also uses a connection manager to connect to a target system and along with that you also specify the target table and writing mode, i.e. write one row at a time or do a bulk insert as well as several other properties.
Please note, the source and destination adapters can both use the same connection manager if you are reading and writing to the same database.
7. What is the Data Path and how is it different from a Precedence Constraint?
Answer: Data Path is used in a Data Flow task to connect to different components of a Data Flow and show the transition of the data from one component to another. A data path contains the Meta information of the data flowing through it, such as the columns, data type, size, etc. When we talk about differences between the data path and precedence constraint; the data path is used in the data flow, which shows the flow of data. Whereas the precedence constraint is used in control flow, which shows control flow or transition from one task to another task
8. What is SSIS event logging?
Answer: Like any other modern programming language, SSIS also raises different events during package execution life cycle. You can enable or write these events to trace the execution of your SSIS package and its tasks. You can also write your custom message as a custom log. You can enable event logging at the package level as well as at the tasks level. You can also choose any specific event of a task or a package to be logged. This is essential when you are troubleshooting your package and trying to understand a performance problem or root cause of a failure. Check out this tip about Custom Logging in SQL Server Integration Services SSIS.
9. What is the Logging Mode property?
Answer: SSIS packages and all of the associated tasks or components have a property called Logging Mode. This property accepts three possible values: Enabled – to enable logging of that component, Disabled – to disable logging of that component and Use Parent Setting – to use parent’s setting of that component to decide whether or not to log the data.
10. Can we add our custom code in SSIS?
Answer: To add your own code:-
1. In the control flow tab, drag and drop Script Task from the toolbox.
2. Double click on script task to open and select edit to open Script task editor.
3. In the script task editor, there are 3 main properties
- i.) General – Here you can specify the name and description
- ii.) Script – through this we can add our code by clicking on Design Script button. The scripting language present is VB.Net only.
- iii.) Expression.
11. Explain the use of containers in SSIS and also their types?
Answer: Containers can be defined as objects that store one or more tasks. The primary purpose of the container is grouping logically related tasks. Once the task is placed into the containers, we can perform various operations such as looping on container level until the desired criterion is met. Nesting of the container is allowed. The container is placed inside the control flow.
There are 4 types of Container:-
- Task Host container- Only one task is placed inside the container. This is the default container.
- Sequence Container – This container can be defined as a subset of package control flow.
- For loop container – Allows looping based on condition. Runs a control flow till the condition is met.
- for each loop container – Loop through container-based on the enumerator.
12. Error Handling in SSIS?
Answer: An error handler allows us to create flows to handle errors in the package is quite an easy way. Through event handler tab, we can name the event on which we want to handle errors and the task that needs to be performed when such an error arises. We can also add sending mail functionality in event of any error through SMTP Task in the Event handler. This is quite useful in event of any failure in office non-working hours. In the Data flow, we can handle errors for each connection through following failure path or red arrow.
13. Explain why variables called the most powerful component of SSIS?
Answer: Variable allows us to dynamically control the package at runtime. Example: You have some custom code or script that determines the query parameter’s value. Now, we cannot have a fixed value for the query parameter. In such scenarios, we can use variables and refer the variable to the query parameter. We can use variables for like:
- Updating the properties at runtime,
- Populating the query parameter value at runtime,
- used in a script task,
- Error handling logic and
- with various looping logic.
14. What is precedence constraint?
Answer: A precedence constraint is a link between 2 control flow tasks and lays down the condition on which the second task is run. They are used to control the workflow of the package. There are 3 kinds of precedence constraint – success (green arrow), failure (red arrow) or Completion script task (blue arrow). By default, when we add 2 tasks, it links by a green arrow. The way the precedence constraint is evaluated can be based on the outcome of the initial task. Also, we can add expression to evaluate such an outcome. Any expression that can be judged as true or false can be used for such purpose. The precedence constraint is very useful in error handling in SSIS package.
15. How can you ensure your package works when you move it to another server?
Answer: In SSIS 2005 till 2008 R2 (also exists in 2012) we can use Package Configuration which is commonly used with SQL Server table (or XML) and environment variable. To be honest Package Configuration was a pain in the next and I much prefer SSIS 2012 Project Deployment Model with Parameters and SSIS Catalog. For more info visit our SSIS 2012 Configuration Tutorial.
16. What is the data flow buffer?
Answer:SSIS operates using buffers which is kind of an in-memory virtual table to hold data. We are able to set max rows/size but this gets more interesting with blocking transformations.
17. What is the difference between Union All and Merge Transformation?
Answer: Union All and Merge are similar but Merge requires sorted input and accepts only 2 inputs where we Union All can accept more than 2.
18. How to send an SSRS report from SSIS?
Answer: Often there is a requirement to be able to send an SSRS report in Excel, PDF or another format to different users from an SSIS package once it has finished performing a data load. In order to do this, first, you need to create a subscription to the report. You can create an SSRS report subscription from Report Manager. At the report subscription, you can mention the report format and the email address of the recipient. When you create a schedule for the SSRS report, a SQL Server Agent Job will be created. From the SSIS, by using sp_start_job and passing the relevant job name you can execute the SSRS report subscription.
19. What is Attunity Driver and why do we user in SSIS?
Answer: Attunity provides 2 high-speed connectors. One for Oracle and one for Teradata. They have been selected by Microsoft to be included with SQL Server 2008 Integration Services (SSIS) SQL 2008 Enterprise Edition. These drivers are highly optimized and very easy to use
20. How checkpoint works in for loop?
Answer: For each Loop, the container is another atomic unit of work that can be restarted. However, the checkpoint file does not contain information about the work completed by the child containers, and the for each Loop container and its child containers run again when the package restarts.