SQL Server Integration Service is often termed as SSIS , after installing the SSIS service, you get some set of comment that allows you to design complex dataflow retrieving data from various data sources, then filtering those data and pushing to SQL service database and other destinations,
Here we see some commonly asked SSIS interview questions for beginners and experienced developers.
What is the RetainSameConnection property and what is the impact?
Whenever a task uses a connection manager to connect to source or destination database, a connection is opened and closed with the execution of that task. Sometimes you might need to open a connection, execute multiple tasks and close it at the end of the execution. This is where RetainSameConnection property of the connection manager might help you. When you set this property to TRUE, the connection will be opened on first time it is used and remain open until execution of the package completes.
What are SSIS Connection Managers?
When we talk of integrating data, we are actually pulling data from different sources and writing it to a destination. But how do you get connected to the source and destination systems? This is where the connection managers come into the picture. Connection manager represent a connection to a system which includes data provider information, the server name, database name, authentication mechanism, etc. For more information check out the SQL Server Integration Services (SSIS) Connection Managers and Connection Managers in SQL Server 2005 Integration Services SSIS tips.
What is ETL ?
ETL stands for Extraction, Transformation and Loading of Data
What are a source and destination adapters?
A source adaptor basically indicates a source in Data Flow to pull data from. The source adapter uses a connection manager to connect to a source and along with it you can also specify the query method and query to pull data from the source.
Similar to a source adaptor, the destination adapter indicates a destination in the Data Flow to write data to. Again like the source adapter, the destination adapter also uses a connection manager to connect to a target system and along with that you also specify the target table and writing mode, i.e. write one row at a time or do a bulk insert as well as several other properties.
What is Data Transformation?
The data transformation stage applies a series of rules or functions to the extracted data from the source to derive the data for loading into the end target.
What kind of containers can you use with SSIS packages?
There are three types of containers: sequence, for loops and for each loops.
A sequence container is a simple way to group similar tasks together. Think of a sequence container as an organization container for more complex SSIS packages.
A for loop container is what you can use to execute your tasks to a certain number of times. For example, you need to update records ten times, you can place the task that updates the records inside this for loop container and specifies 10 as the end of the loops. by using the for loop container, you don't have to create either ten different packages to do the same task or have to run the whole package ten times when you schedule your job.
A for each loop container will be useful when you don’t know a head of time how many times a task should perform. for instance, let’s say that you want to delete all the files inside a folder, but you don’t know how many files are there at any particular time, by using for each loop, it can go through the collection of files and delete them for you, after the collection is emptied out, it knows that when it should stop.
What is Manifest file in SSIS?
Manifiest file is the utility which can be used to deploy the package using wizard on file system and SQL Server database.
What is data Profiling task?
Data profiling is the process of analyzing the source data to better understand
what condition the data is in, in terms of cleanliness, patterns,
numbers or nulls, and so on. data profiling task usually be used at the beginning
of the development cycle to support the design of the destination database schema.
Note that this task is not used when you develop the normal recurring ETL packages.
What is the multicast Transformation in SSIS?
The Multicast transform, as the name implies, can send single data input to multiple output paths easily. You may want to use this transformation to send a path to multiple destinations sliced in different ways. The multicast transformation is similar to the Split Transformation because both send data to multiple outputs. However, you can not specify the conditions for which part of the data will be in which output in the Multicast transformation.
Difference between Merge and Union All?
The Merge transformation can merge data from two paths or two Data Sources into a single output. The Transform is useful when you wish to break out your Data Flow into a path that handles certain errors and then merge it back into the main Data Flow downstream after the errors handling is done. .
The Union All Transformation works much the same way as the Merge Transformation, but it does not require the data to be sorted. It takes the outputs from multiple sources or transforms and combines them
What are the different SSIS log providers?
There are several places where you can log execution data generated by an SSIS event log:
- SSIS log provider for Windows Event Log
- SSIS log provider for Text files
- SSIS log provider for SQL Profiler
- SSIS log provider for XML files
- SSIS log provider for SQL Server, which writes the data to the msdb.
What is an SSIS breakpoint? How do you configure / disable or delete it?
A breakpoint allows you to pause the execution of the package in Business Intelligence Development Studio during development or when troubleshooting an SSIS Package. We can right click on the task in control flow, click on Edit Breakpoint menu and from the Set Breakpoint window, we can specify when we want execution to be halted/paused. (For example OnPreExecute, OnPostExecute, OnError events, etc.)
To toggle a breakpoint, delete all breakpoints and disable all breakpoints go to the Debug menu and click on the respective menu item. You can event specify different conditions to hit the breakpoint as well
Different between Control Flow and Data Flow
- Data Flow is the subset of control flow
- There will be only one control flow while multiple dataflow can exists
- Data flow cannot work without a control flow
- Control flow is for designing the flow of the package. Data flow is for ETL process