Abinitio Questions

Ab Initio is an ETL tool used to extract, transform, and load data from various sources into data warehouses. It supports batch and continuous processing as well as data movement and transformation. Ab Initio uses a 2-tier architecture with a Graphical Development Environment (GDE), the Co>Operating System, and the Enterprise Meta>Environment (EME) to transform large volumes of data quickly and with high performance across many database types. The GDE is used to design graphs that define the ETL processes, which are then run on the Co>Operating System. Metadata is stored in the EME.

Abinitio: Ab Initio is a data warehouse ETL (Extract, Transform, Load) tool. It extracts data from various sources,
transforms the data per the business requirements, and loads the data into the DWH.
Ab Initio can support critical applications such as DWH, batch processing, continuous processing, data movement,
data transformation, etc.
Benefits: transforms huge volumes of data effectively, is very quick, gives high performance, supports many databases
(Oracle, SQL, PL/SQL, PostgreSQL…), and has checkpoint, debugging, and restartability features available.
Architecture: 2-tier architecture. Components: GDE, Co>Operating System, EME (Enterprise Meta>Environment).
GDE: Graphical Development Environment. The GDE is a graphical application for developers, used for designing
and running Ab Initio graphs. It can communicate with the Co>OS.
Co>OS: the Co>Operating System is a program provided by Ab Initio that operates on top of the native OS and is the
base for all Ab Initio processes.
EME: the EME is a repository that holds a Unix-like directory structure and is used for metadata management in Ab Initio.
The GDE has 2 sections: 1. Component Organizer, 2. Sandbox
Component Organizer: lists the many components available in Ab Initio, for example Input File, Output File, Sort,
Filter by Expression, Normalize, Denormalize.
Serial file: in a serial file, data is processed one record at a time; the second record is processed only after the
first record is completed, and so on.
Parallel file: multifiles. The data is divided into partitions and all the partitions flow in parallel through Ab Initio,
which increases performance.
Parallelism:
Component parallelism: used by a graph with multiple processes executing simultaneously on separate data.
Data parallelism: used by a graph that works with data divided into segments and operates on each segment
simultaneously.
Pipeline parallelism: used by a graph with multiple components running simultaneously on the same data.
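To make data parallelism concrete, here is a minimal Python sketch, not Ab Initio code. It assumes records are dictionaries and uses hypothetical names (partition_by_key, total_amount, cust_id, amount): records are split into partitions by a stable hash of a key, and the same logic then runs on every partition at the same time.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def partition_by_key(records, key, n_partitions):
    """Split records into n_partitions lists using a stable hash of the key field."""
    partitions = [[] for _ in range(n_partitions)]
    for rec in records:
        idx = zlib.crc32(str(rec[key]).encode("utf-8")) % n_partitions
        partitions[idx].append(rec)
    return partitions


def total_amount(partition):
    """Hypothetical per-partition step: the same logic runs on every partition."""
    return sum(rec["amount"] for rec in partition)


# Ten illustrative records spread over five customer ids.
records = [{"cust_id": i % 5, "amount": 10 * i} for i in range(1, 11)]
partitions = partition_by_key(records, "cust_id", 4)

# Data parallelism: one worker per partition, all running the same function.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_totals = list(pool.map(total_amount, partitions))

grand_total = sum(partial_totals)
```

Partitioning by key keeps all records for one key in the same partition, which is why the partial totals can simply be summed at the end.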
LAYOUT: a layout determines the location of a resource. A layout is either serial or parallel. A serial layout specifies
a single computer and a single directory on that computer. A parallel layout specifies multiple computers with multiple
directories across the computers.
Control Center: a tool for scheduling jobs; jobs are tracked in Control Center.
Plan and PSET testing: a Plan is a combination of PSETs that run in a sequential or parallel manner.
Typically a Plan has the end-to-end process embedded in a single Plan.
A PSET is a parameter set defined on a generic graph with different parameters. We use the same graph
to load different tables by passing different parameters and creating different PSETs.
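The idea of one generic graph driven by several PSETs can be sketched in Python. This is only an analogy: generic_load, source_file, and target_table are hypothetical names, and a real PSET is an Ab Initio artifact rather than a dictionary.

```python
def generic_load(params):
    """Stand-in for one generic graph: its behaviour is driven entirely by parameters."""
    return f"load {params['source_file']} into {params['target_table']}"


# Each dictionary plays the role of one PSET for the same generic graph.
psets = [
    {"source_file": "customers.dat", "target_table": "CUSTOMER"},
    {"source_file": "orders.dat", "target_table": "ORDERS"},
]

# A plan runs the PSETs; here they run sequentially.
plan_results = [generic_load(p) for p in psets]
```

Loading a further table then means adding one more parameter dictionary, not writing another function.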

1. Reformat changes the record format of data records by dropping fields, or by using DML expressions to add
fields, combine fields, or transform the data in the records. By default Reformat has one output port, but more can
be added by incrementing the value of the count parameter. Parameters: count, select, transform, output-index,
output-indexes.
2. output-index: always returns a single integer. output-indexes: returns a vector, e.g. (0, 1).
3. Rollup transform functions: Rollup allows users to group records on certain field values. It is a multi-stage
function and contains 1. initialize, 2. rollup, 3. finalize functions.
4. Filter by Expression: used to filter records based on DML expressions. Parameters: select expression,
reject threshold, logging.
5. m_dump syntax: m_dump <dml_path> <inputfile.path>
6. M_Queue: multi queues are multi-way queues (8-way/4-way). They work on the FIFO concept and are used as load-ready files.
7. In an MFS, if we remove one of the data partition files, will it throw an error? Yes, it throws an error. If data is
deleted (or truncated) in a partition, Ab Initio does not throw an error. If a partition itself is deleted, Ab Initio
throws an error.
8. du command: du is used to estimate file space usage, i.e. the space used under a particular directory or by files
on a file system.
df command: df is used to display the amount of available disk space for a file system.
9. Teradata utilities: a large amount of data can be transferred from the host to Teradata by using the various
Teradata utilities, i.e. BTEQ (Basic Teradata Query), FastLoad, MultiLoad, TPump.
10. Primary index: used to specify where the data resides in Teradata.
11. How to delete empty lines from a Unix file: grep . file.txt (prints only the lines that contain at least one character)
12. Abinitio RC: the Ab Initio RC file holds the EME connection details and the connection information for
connecting from one server to another server.
13. SFTP (secure file transfer using SSH), SCP (Secure Copy): these are general commands for sending files from one
environment to another. SCP is a protocol that allows transferring files securely from a local host to a remote
host. SFTP is a protocol that allows accessing, transferring, and managing files over a reliable data stream. SCP is
usually the faster of the two for simple copies, while SFTP offers more file-management operations.
14. Ab Initio Queue: a queue is a data structure where we store data. We can read and write the data by using the
Subscribe and Publish components, and the queue helps us store the records in a sequence of files. An Ab Initio queue
is FIFO (First In, First Out). Queues provide record-based persistence. Publishers write data to the queue.
15. Sandbox: a sandbox is the local copy of an Ab Initio EME project, confined to a specific user. Multiple
users can have copies of the EME project in their sandboxes, and users can work in their sandboxes on the same
project at the same time. There are 2 types. Public sandbox: visible to other projects. Private sandbox: not
accessible to other projects.
16. Rec file (recovery file): when a graph is run, a recovery file is created along with it, so that if any failure
occurs we can restart from that point only.
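The three-stage Rollup from item 3 (initialize, rollup, finalize) can be sketched in plain Python. This is an illustration of the pattern only, not Ab Initio DML; the record fields (dept, amount) and the helper run_rollup are assumptions made for the example.

```python
from itertools import groupby
from operator import itemgetter


def initialize(first_record):
    """Create the temporary accumulator for a new group."""
    return {"count": 0, "total": 0}


def rollup(temp, record):
    """Fold one record of the group into the accumulator."""
    temp["count"] += 1
    temp["total"] += record["amount"]
    return temp


def finalize(temp, key):
    """Produce the single output record for the group."""
    return {"key": key, "count": temp["count"], "total": temp["total"]}


def run_rollup(records, key_field):
    """Group records on key_field and apply the three stages to each group."""
    out = []
    ordered = sorted(records, key=itemgetter(key_field))
    for key, group in groupby(ordered, key=itemgetter(key_field)):
        group = list(group)
        temp = initialize(group[0])
        for rec in group:
            temp = rollup(temp, rec)
        out.append(finalize(temp, key))
    return out


records = [
    {"dept": "HR", "amount": 100},
    {"dept": "IT", "amount": 250},
    {"dept": "HR", "amount": 50},
]
summary = run_rollup(records, "dept")
```

Each group yields exactly one output record, so three input records over two departments give two summary records.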
4TH HIGHEST SALARY: select min(salary) from (select distinct top 4 salary from employee order by salary desc) as top4
2ND HIGHEST SALARY: select max(salary) from employee where salary not in (select max(salary) from employee)
Nth HIGHEST SALARY: select * from employee e1 where (n-1) = (select count(distinct e2.salary) from employee e2 where
e2.salary > e1.salary)
NEW TABLE CREATE: select * into newtable from oldtable (to copy without data, add: where 1=0)
DUPLICATE FIND: select empid, count(*) from employee group by empid having count(*) > 1
DELETE DUPLICATE: with d as (select *, row_number() over (partition by empid order by empid) as rn from employeetable)
delete from d where rn > 1
SUBSTRING: select substring(fullname, 1, charindex('_', fullname) - 1) as firstname,
substring(fullname, charindex('_', fullname) + 1, len(fullname)) as lastname from employee
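The Nth-highest and duplicate-find queries above can be tried out with a short Python sketch. It runs them against an in-memory SQLite database, which is an assumption made for convenience (the queries in these notes look like SQL Server syntax); the sample rows and the helper nth_highest_salary are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empid INTEGER, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 900), (2, "Ben", 700), (3, "Chen", 700), (4, "Dev", 500), (4, "Dev", 500)],
)


def nth_highest_salary(n):
    """Return the n-th highest distinct salary, or None if there are fewer than n."""
    row = conn.execute(
        """
        SELECT DISTINCT e1.salary
        FROM employee e1
        WHERE ? - 1 = (SELECT COUNT(DISTINCT e2.salary)
                       FROM employee e2
                       WHERE e2.salary > e1.salary)
        """,
        (n,),
    ).fetchone()
    return row[0] if row else None


# empid values that occur more than once, with their counts.
duplicates = conn.execute(
    "SELECT empid, COUNT(*) FROM employee GROUP BY empid HAVING COUNT(*) > 1"
).fetchall()
```

The correlated subquery counts how many distinct salaries are larger than the current row's salary, so a row qualifies as Nth highest when exactly n-1 salaries beat it.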
PRIMARY KEY: A PRIMARY KEY constraint uniquely identifies each record in a database table. All columns
participating in a primary key constraint must not contain NULL values
SURROGATE KEY: a surrogate key is generated when a new record is inserted into a table. When a primary key is
generated at runtime, it is called a surrogate key.
FACT TABLE: a fact table represents the metrics or measurements (facts) of a business process. Facts are
linked with dimensions in the table. Types: additive facts, semi-additive facts, non-additive facts.
DIMENSION TABLE: dimensions are descriptive data identified by keys. Dimensions are organized in
tables called dimension tables. Types include conformed dimension and junk dimension.
SCD(Slowly Changing Dimension):SCD is a dimension that stores and manages both current and historical
data over time in a data warehouse. It is considered and implemented as one of the most critical ETL tasks
in tracking the history of dimension records.
SCD1: in SCD1 the new data overwrites the existing data. Thus the existing data is lost, as it is not stored anywhere else.
SCD2: another dimension record is created. A new record is created with the changed data values, and this new record
becomes the current record.
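The difference between SCD1 and SCD2 can be shown with a small Python sketch. It assumes a dimension held as a list of dictionaries; the column names (cust_id, city, start_date, end_date, is_current) are hypothetical, and real implementations usually also assign a new surrogate key to the SCD2 row.

```python
def apply_scd1(dimension, key, new_values):
    """SCD1: overwrite in place; the old values are lost."""
    for row in dimension:
        if row["cust_id"] == key:
            row.update(new_values)
    return dimension


def apply_scd2(dimension, key, new_values, change_date):
    """SCD2: close the current row and append a new current row."""
    for row in dimension:
        if row["cust_id"] == key and row["is_current"]:
            row["is_current"] = False
            row["end_date"] = change_date
            new_row = {**row, **new_values,
                       "start_date": change_date, "end_date": None, "is_current": True}
            dimension.append(new_row)
            break  # stop before the loop reaches the row just appended
    return dimension


scd1 = apply_scd1([{"cust_id": 1, "city": "Pune"}], 1, {"city": "Delhi"})

scd2 = apply_scd2(
    [{"cust_id": 1, "city": "Pune", "start_date": "2023-01-01",
      "end_date": None, "is_current": True}],
    1, {"city": "Delhi"}, "2024-06-01",
)
```

After the SCD1 call only the new city remains, while the SCD2 dimension holds two rows: the closed historical row and the new current row.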
JOINS: a JOIN clause is used to combine rows from 2 or more tables based on related columns between them. Types: inner
join, left join, right join, outer join.
UNION & UNION ALL: the UNION operator is used to combine the result sets of 2 or more SELECT statements. Every SELECT
statement within a UNION must have the same number of columns and the same data types. UNION does not return duplicate
values. UNION ALL returns all records, including duplicates.
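A quick way to see the difference between UNION and UNION ALL is to run both on two tiny tables. The sketch below uses Python with an in-memory SQLite database; the tables a and b are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (2), (3);
""")

# UNION removes the duplicate value 2; UNION ALL keeps both copies.
union_rows = conn.execute(
    "SELECT id FROM a UNION SELECT id FROM b ORDER BY id"
).fetchall()
union_all_rows = conn.execute(
    "SELECT id FROM a UNION ALL SELECT id FROM b ORDER BY id"
).fetchall()
```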
CONSTRAINTS: UNIQUE, NOT NULL, CHECK, DEFAULT, INDEX, PRIMARY KEY, FOREIGN KEY
OLTP: Online transaction processing captures, stores, and processes data from transactions in real time.
OLAP: Online analytical processing uses complex queries to analyze aggregated historical data from OLTP
Smoke testing : Smoke Testing is performed to ascertain that the critical functionalities of the program are working
fine. Smoke testing exercises the entire system from end to end
Sanity Testing: Sanity testing is done at random to verify that each functionality is working as expected.
Normalization: a database design technique used to reduce redundant, unwanted, or repeated data.
Normal forms: 1NF, 2NF, 3NF.
STAR SCHEMA: A star schema contains both dimension tables and fact tables. In a star schema the fact table
sits at the center and is surrounded by dimension tables.
SNOWFLAKE SCHEMA: A snowflake schema contains all three: dimension tables, fact tables, and sub-dimension
tables. Each dimension is normalized into sub-dimensions.
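A minimal sketch of querying a star schema, written in Python against an in-memory SQLite database: one fact table joined to its dimension tables. The table and column names (fact_sales, dim_product, dim_date, and so on) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE fact_sales (product_key INTEGER, date_key INTEGER, amount INTEGER);
    INSERT INTO dim_product VALUES (1, 'Pen'), (2, 'Book');
    INSERT INTO dim_date VALUES (20230101, 2023), (20240101, 2024);
    INSERT INTO fact_sales VALUES (1, 20230101, 10), (1, 20240101, 15), (2, 20240101, 40);
""")

# The fact table carries the measure (amount); the dimensions supply the labels.
sales_by_product_year = conn.execute("""
    SELECT p.product_name, d.year, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY p.product_name, d.year
    ORDER BY p.product_name, d.year
""").fetchall()
```

In a snowflake variant, dim_product would itself reference a further sub-dimension table (for example a product category table), adding one more join.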
CHECK-OUT PROCESS: While working on any project, we need to check out the code from the EME project path. We
check out the latest version of the code into our sandbox (to make a local copy) and can then start working
on the project in our local copy. Once all the changes are done and the code is modified, we can check in our copy
of the project (sandbox) to the EME as a different version.
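CASE: the CASE statement goes through conditions and returns a value when the first condition is met (like if-then-else).
RANK, DENSE_RANK, ROW_NUMBER: each assigns a number to every record in a result set. RANK gives tied values the same rank and skips the following ranks; DENSE_RANK gives tied values the same rank without skipping; ROW_NUMBER gives every row a unique sequential number.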
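To see how the three numbering functions differ on tied values, here is a small pure-Python sketch that mimics their behaviour for salaries ordered from highest to lowest. The function rank_salaries is a hypothetical helper, not a database API.

```python
def rank_salaries(salaries):
    """Return (salary, row_number, rank, dense_rank) tuples, highest salary first."""
    ordered = sorted(salaries, reverse=True)
    result = []
    rank = 0
    dense_rank = 0
    previous = None
    for row_number, salary in enumerate(ordered, start=1):
        if salary != previous:
            rank = row_number   # RANK jumps to the row position after a tie
            dense_rank += 1     # DENSE_RANK only ever moves up by one
            previous = salary
        result.append((salary, row_number, rank, dense_rank))
    return result


ranked = rank_salaries([500, 700, 900, 700])
```

With two employees tied on 700, both get rank 2 and dense rank 2; the next salary gets rank 4 but dense rank 3, while the row numbers run 1 to 4 without repeats.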
VIEW: a view is a virtual table that acts like an actual table. The results of a view are not stored in the database, so it takes no storage for data.
MATERIALIZED VIEW: the results of the view expression are stored in the database system, so it does take up storage.
SUB Query : A Subquery is a SQL query within another query. It is a subset of a Select statement whose return
values are used in filtering the conditions of the main query.
Correlated SUB Query: a correlated subquery is a subquery that uses values from the outer query in order to
complete. Because a correlated subquery requires the outer query to be executed first, the correlated subquery must
run once for every row in the outer query. It is also known as a synchronized subquery.
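As an illustration of a correlated subquery, the Python sketch below finds employees who earn more than the average salary of their own department. It uses an in-memory SQLite database, and the table contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        ("Asha", "HR", 900),
        ("Ben", "HR", 500),
        ("Chen", "IT", 700),
        ("Dev", "IT", 800),
        ("Eli", "IT", 600),
    ],
)

# The inner query refers to e1.dept from the outer query, so it is
# evaluated for each outer row.
above_avg = conn.execute("""
    SELECT e1.name
    FROM employee e1
    WHERE e1.salary > (SELECT AVG(e2.salary)
                       FROM employee e2
                       WHERE e2.dept = e1.dept)
    ORDER BY e1.name
""").fetchall()
```

Both departments average 700 here, so only Asha (900) and Dev (800) qualify; Chen at exactly 700 does not.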
