Monday, December 24, 2018

Toad Shortcut Keys

Shortcut Description
F1 Windows Help File
F2 Toggle Output Window
Shift+F2 Toggle Data Grid Window
F3 Find Next Occurrence
Shift+F3 Find Previous Occurrence
F4 Describe Table, View, Procedure, Function
F5 Execute SQL as a script
F6 Toggle between SQL Editor and Results Panel
F7 Clear All Text
F8 Recall Previous SQL Statement
F9 Execute Statement
Ctrl+F9 Set Code Execution Parameters
Shift+F9 Execute Current SQL statement at Cursor
F10 or right-click Pop-up Menu
Ctrl+F12 External Editor, Pass Contents
Ctrl+A Select All Text
Ctrl+C Copy
Ctrl+E Execute Explain Plan on the Current Statement
Ctrl+F Find Text
Ctrl+G Goto Line
Ctrl+L Convert Text to Lowercase
Ctrl+M Make Code Statement
Ctrl+N Recall Named SQL Statement
Ctrl+O Open a Text File
Ctrl+P Strip Code Statement
Ctrl+R Find and Replace
Ctrl+S Save File
Shift+Ctrl+S Save File As
Ctrl+T Columns Drop-down
Shift+Ctrl+R Alias Replacement
Shift+Ctrl+T Columns Drop-Down no alias
Ctrl+Spacebar Code Templates
Ctrl+U Converts Text to Uppercase
Ctrl+V Paste
Ctrl+X Cut
Ctrl+Z Undo Last Change
Ctrl+. Display Pop-up List of Matching Table Names
Shift+Ctrl+Z Redo Last Undo
Alt+Up Arrow Display Previous Statement
Alt+Down Arrow Display Next Statement (After Alt+Up Arrow)
Ctrl+Home In the data grid: goes to the top of the record set
Ctrl+End In the data grid: goes to the end of the record set
Ctrl+Tab Cycles through the Collection of MDI Child Windows

Tuesday, November 6, 2018

BULK COLLECT & FORALL vs. CURSOR & FOR LOOP in Oracle

SQL> create table sarlakg (owner varchar2(30), name varchar2(30), type varchar2(19));

Table created.

Example 1: CURSOR_FOR_OPEN_QUERY

SQL> set timing on;
SQL> CREATE OR REPLACE PROCEDURE CURSOR_FOR_OPEN_QUERY
  2   IS
  3   l_sOwner VARCHAR2(30);
  4   l_sName VARCHAR2(30);
  5   l_sType VARCHAR2(19);
  6   CURSOR cur IS SELECT owner, object_name name, object_type type FROM all_objects;
  7   BEGIN
  8   dbms_output.put_line('Before CURSOR OPEN: ' || systimestamp);
  9   OPEN cur;
 10   dbms_output.put_line('Before LOOP: ' || systimestamp);
 11   LOOP
 12   FETCH cur INTO l_sOwner, l_sName, l_sType;
 13   IF cur%NOTFOUND THEN
 14   EXIT;
 15   END IF;
 16   INSERT INTO sarlakg values (l_sOwner, l_sName, l_sType);
 17   END LOOP;
 18   CLOSE cur;
 19   dbms_output.put_line('After CURSOR CLOSE: ' || systimestamp);
 20   COMMIT;
 21   END;
 22  /

Procedure created.

Elapsed: 00:00:00.47
SQL> exec CURSOR_FOR_OPEN_QUERY();

PL/SQL procedure successfully completed.

Elapsed: 00:00:05.80
SQL> select count(*) from sarlakg;

  COUNT(*)
----------
     55731

Elapsed: 00:00:00.00
SQL>


SQL> truncate table sarlakg;

Table truncated.

Elapsed: 00:00:00.15

Example 2: FOR_QUERY

SQL> CREATE OR REPLACE PROCEDURE FOR_QUERY
  2   IS
  3   BEGIN
  4   dbms_output.put_line('Before CURSOR: ' || systimestamp);
  5   FOR cur IN (SELECT owner, object_name name, object_type type FROM all_objects) LOOP
  6   INSERT INTO sarlakg values (cur.owner, cur.name, cur.type);
  7   END LOOP;
  8   dbms_output.put_line('After CURSOR: ' || systimestamp);
  9   COMMIT;
 10   END;
 11  /

Procedure created.

Elapsed: 00:00:00.13
SQL> exec FOR_QUERY();

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.58
SQL> select count(*) from sarlakg;

  COUNT(*)
----------
     55732

Elapsed: 00:00:00.01


SQL> truncate table sarlakg;

Table truncated.

Elapsed: 00:00:00.15

Example 3: BULK_COLLECT_QUERY

SQL> CREATE OR REPLACE PROCEDURE BULK_COLLECT_QUERY
  2   IS
  3   TYPE sOwner IS TABLE OF VARCHAR2(30);
  4   TYPE sName IS TABLE OF VARCHAR2(30);
  5   TYPE sType IS TABLE OF VARCHAR2(19);
  6   l_sOwner sOwner;
  7   l_sName sName;
  8   l_sType sType;
  9   BEGIN
 10   dbms_output.put_line('Before Bulk Collect: ' || systimestamp);
 11   SELECT owner, object_name, object_type
 12   BULK COLLECT INTO l_sOwner, l_sName, l_sType
 13   FROM all_objects;
 14   dbms_output.put_line('After Bulk Collect: ' || systimestamp);
 15   --
 16   FORALL indx IN l_sName.FIRST..l_sName.LAST
 17   INSERT INTO sarlakg values (l_sOwner(indx), l_sName(indx), l_sType(indx));
 18   --
 19   dbms_output.put_line('After FORALL: ' || systimestamp);
 20   COMMIT;
 21   END;
 22  /

Procedure created.

Elapsed: 00:00:00.04
SQL> exec BULK_COLLECT_QUERY();

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.68
SQL>


SQL> select count(*) from sarlakg;

  COUNT(*)
----------
     55732

Elapsed: 00:00:00.01
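
Across the three runs, the explicit OPEN/FETCH cursor took 5.80 seconds, the cursor FOR loop 2.58 seconds, and BULK COLLECT with FORALL only 0.68 seconds. One caveat: an unbounded BULK COLLECT reads the entire result set into PGA memory at once. For large tables, a LIMIT clause caps the memory use; here is a minimal sketch of that variant (illustrative only, not part of the timing run above; it reuses the same sarlakg table):

CREATE OR REPLACE PROCEDURE BULK_COLLECT_LIMIT_QUERY
 IS
 TYPE t_owner IS TABLE OF all_objects.owner%TYPE;
 TYPE t_name  IS TABLE OF all_objects.object_name%TYPE;
 TYPE t_type  IS TABLE OF all_objects.object_type%TYPE;
 l_owner t_owner;
 l_name  t_name;
 l_type  t_type;
 CURSOR cur IS SELECT owner, object_name, object_type FROM all_objects;
 BEGIN
 OPEN cur;
 LOOP
   -- fetch at most 1000 rows per round trip to cap PGA memory use
   FETCH cur BULK COLLECT INTO l_owner, l_name, l_type LIMIT 1000;
   EXIT WHEN l_owner.COUNT = 0;
   FORALL indx IN 1 .. l_owner.COUNT
     INSERT INTO sarlakg VALUES (l_owner(indx), l_name(indx), l_type(indx));
 END LOOP;
 CLOSE cur;
 COMMIT;
 END;
/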

Tuesday, October 16, 2018

ORA-30009: Not enough memory for CONNECT BY operation


ORA-30009: Not enough memory for CONNECT BY operation
Cause: The memory size was not sufficient to process all the levels of the hierarchy specified by the CONNECT BY clause.
Action: In WORKAREA_SIZE_POLICY=AUTO mode, set PGA_AGGREGATE_TARGET to a reasonably larger value. Or, in WORKAREA_SIZE_POLICY=MANUAL mode, set SORT_AREA_SIZE to a reasonably larger value.
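
Applying the Action takes only a couple of statements; a minimal sketch (the 2G value is purely illustrative, and ALTER SYSTEM requires DBA privileges):

SQL> show parameter pga_aggregate_target
SQL> alter system set pga_aggregate_target = 2G;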



SQL> select count(*) from dual connect by level <= 100000000;
select count(*) from dual connect by level <= 100000000
                     *
ERROR at line 1:
ORA-30009: Not enough memory for CONNECT BY operation



The workaround below builds the 100,000,000 rows as a Cartesian product of two 10,000-row CONNECT BY row sources, so each hierarchy stays small enough to fit in memory:

SQL> select count(*) from
  2  (select level from dual connect by level <= 10000),
  3  (select level from dual connect by level <= 10000);

  COUNT(*)
----------
 100000000


Order By and Null values
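
By default, Oracle sorts NULLs last in an ascending ORDER BY and first in a descending one; the NULLS FIRST and NULLS LAST keywords override this. A quick illustration (assuming the standard scott.emp table, where comm is NULL for most rows):

SQL> select ename, comm from emp order by comm nulls first;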

SQL*Loader Concepts

SQL*Loader Features

Bad, log, and discard files are generated automatically.
Load data across a network: you can run the SQL*Loader client on a different system from the one running the database server.
Specify the character set of the data.
Load data from multiple datafiles during the same load session.
Load data into multiple tables during the same load session.
Manipulate the data before loading it, using SQL functions.
Generate unique sequential key values in specified columns.
Use the operating system's file system to access the datafiles.



SQL*Loader Command-Line Reference
       userid -- Oracle username/password
      control -- Control file name
          log -- Log file name
          bad -- Bad file name
         data -- Data file name
      discard -- Discard file name
   discardmax -- Number of discards to allow
                (Default all)
         skip -- Number of logical records to skip
                (Default 0)
         load -- Number of logical records to load
                (Default all)
       errors -- Number of errors to allow
                (Default 50)
         rows -- Number of rows in conventional path bind array
                or between direct path data saves
                (Default: Conventional Path 64, Direct path all)
     bindsize -- Size of conventional path bind array in bytes
                (System-dependent default)
       silent -- Suppress messages during run
                (header, feedback, errors, discards, partitions, all)
       direct -- Use direct path
                (Default FALSE)
      parfile -- Parameter file: name of file that contains
                parameter specifications
     parallel -- Perform parallel load
                (Default FALSE)
     readsize -- Size (in bytes) of the read buffer
         file -- File to allocate extents from


SQL*Loader Parameters

Parameters can be grouped together in a parameter file. You could then specify the name of the parameter file on the command line using the PARFILE parameter.

Certain parameters can also be specified within the SQL*Loader control file by using the OPTIONS clause.
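
For illustration, a hypothetical parameter file and the equivalent OPTIONS clause might look like this (file names are made up for the example):

load.par:
userid=scott/tiger
control=load.ctl
log=load.log
errors=100

Invocation: sqlldr parfile=load.par

The same ERRORS setting placed inside the control file instead:

OPTIONS (ERRORS=100)
LOAD DATA
...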

SQL*Loader Control File
Contains the mapping between the input data file and the target table.
CONTROL specifies the name of the control file that describes how to load data.
If a file extension or file type is not specified, it defaults to CTL. If omitted, SQL*Loader prompts you for the file name.

Input Data and Datafiles

Data Conversion and Datatype Specification

Discarded and Rejected Records

Log File and Logging Information

Conventional Path Loads, Direct Path Loads, and External Table Loads

Partitioned Object Support

Application Development: Direct Path Load API

Wednesday, October 10, 2018

What's an explain plan?

An explain plan is a representation of the access path that is taken when a query is executed within Oracle.
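
A common way to generate and display a plan (a minimal sketch; it assumes the DBMS_XPLAN package, available since Oracle 9i):

SQL> explain plan for select * from emp;

Explained.

SQL> select * from table(dbms_xplan.display);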

Query processing can be divided into 7 phases:

[1] Syntactic: checks the syntax of the query
[2] Semantic: checks that all objects exist and are accessible
[3] View Merging: rewrites the query as a join on base tables instead of using views
[4] Statement Transformation: rewrites the query, transforming some complex constructs into simpler ones where appropriate (e.g. subquery merging, IN/OR transformation)
[5] Optimization: determines the optimal access path for the query. The Rule Based Optimizer (RBO) uses a set of heuristics; the Cost Based Optimizer (CBO) uses statistics to analyze the relative costs of accessing objects.
[6] QEP Generation: builds the Query Evaluation Plan (QEP)
[7] QEP Execution: executes the Query Evaluation Plan


Join Types

Sort Merge Join (SMJ)
Nested Loops (NL)
Hash Join



Sort Merge Join

Both row sources are sorted on the join key, and the sorted sets are then merged.

SQL> explain plan for
select /*+ ordered */ e.deptno,d.deptno
from emp e,dept d
where e.deptno = d.deptno
order by e.deptno,d.deptno;

Query Plan
-------------------------------------
SELECT STATEMENT [CHOOSE] Cost=17
  MERGE JOIN
    SORT JOIN
      TABLE ACCESS FULL EMP [ANALYZED]
    SORT JOIN
      TABLE ACCESS FULL DEPT [ANALYZED]



Nested Loops

For each row in the outer (driving) row source, the inner row source is probed for matching rows.

SQL> explain plan for
select a.dname,b.sal
from dept a,emp b
where a.deptno = b.deptno;

Query Plan
-------------------------
SELECT STATEMENT [CHOOSE] Cost=5
  NESTED LOOPS
    TABLE ACCESS FULL DEPT [ANALYZED]
    TABLE ACCESS FULL EMP [ANALYZED]

Hash Join

A hash table is built in memory on the smaller row source and then probed with rows from the larger one.

SQL> explain plan for
select /*+ use_hash(emp) */ empno
from emp,dept
where emp.deptno = dept.deptno;

Query Plan
----------------------------
SELECT STATEMENT  [CHOOSE] Cost=3
  HASH JOIN
    TABLE ACCESS FULL DEPT
    TABLE ACCESS FULL EMP





SQL> set autotrace traceonly explain;
SQL> select * from test;

Execution Plan
----------------------------------------------------------
Plan hash value: 1357081020

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    20M|    95M|  8480   (3)| 00:01:42 |
|   1 |  TABLE ACCESS FULL| TEST |    20M|    95M|  8480   (3)| 00:01:42 |
--------------------------------------------------------------------------

SQL> select /*+ PARALLEL(4) */ *
  2  from test
  3  /

Execution Plan
----------------------------------------------------------
Plan hash value: 3388271637

--------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |    20M|    95M|  2351   (3)| 00:00:29 |        |      |         |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |         |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 |    20M|    95M|  2351   (3)| 00:00:29 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |          |    20M|    95M|  2351   (3)| 00:00:29 |  Q1,00 | PCWC |         |
|   4 |     TABLE ACCESS FULL| TEST     |    20M|    95M|  2351   (3)| 00:00:29 |  Q1,00 | PCWP |         |
--------------------------------------------------------------------------------------------------------------

Note
-----
   - Degree of Parallelism is 4 because of hint

SQL> select /*+ PARALLEL(2) */ *
  2  from test ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3388271637

--------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |    20M|    95M|  4703   (3)| 00:00:57 |        |      |         |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |         |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 |    20M|    95M|  4703   (3)| 00:00:57 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |          |    20M|    95M|  4703   (3)| 00:00:57 |  Q1,00 | PCWC |         |
|   4 |     TABLE ACCESS FULL| TEST     |    20M|    95M|  4703   (3)| 00:00:57 |  Q1,00 | PCWP |         |
--------------------------------------------------------------------------------------------------------------

Note
-----
   - Degree of Parallelism is 2 because of hint

SQL>
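
Besides the statement-level hint, a default degree of parallelism can also be set as a table attribute (illustrative commands, not part of the session above):

SQL> alter table test parallel 4;
SQL> alter table test noparallel;
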
Tuesday, October 9, 2018

How to delete duplicate records from a table in Oracle

Four ways to delete duplicate records in Oracle. First, identify the duplicated empno values:

SQL> select empno,count(empno) from emp21 group by empno
  2  having count(empno)>1;

     EMPNO COUNT(EMPNO)
---------- ------------
      7782            2
      7844            2
      7698            2
      7902            2
      7566            2
      7788            2
      7654            2
      7934            2
      7876            2
      7900            2

10 rows selected.


1. Using rowid with a correlated subquery

SQL> delete from emp21 e1
     where rowid not in
       (select max(rowid) from emp21 e2
        where e1.empno = e2.empno);



2. Using rowid with GROUP BY

SQL> delete from emp21
     where rowid not in
       (select max(rowid) from emp21 group by empno);


3. Using row_number()

SQL> delete from emp21 where rowid in
                   (
                     select rid from
                      (
                        select rowid rid,
                          row_number() over(partition by empno order by empno) rn
                          from emp21
                      )
                    where rn > 1
                  );



4. Using dense_rank()

SQL> delete from emp21 where rowid in
                   (
                     select rid from
                      (
                        select rowid rid,
                          dense_rank() over(partition by empno order by rowid) rn
                          from emp21
                      )
                    where rn > 1
                   );

Note: the analytic must be ordered by something unique within each partition (rowid here); ordering by empno, the partitioning column itself, would give every row a dense_rank of 1 and delete nothing.

Monday, October 8, 2018

SQL*Loader in Oracle: example



SQL*Loader: The Step-by-Step Basics - Example




SQL> create table customer_ldr (cid number primary key, cname varchar2(20),
loc varchar2(20), type char(1), load_date date);

Table created.

*******************data.txt*************************

1,prabhakar,bidar,y,sysdate,
2,manju,bang,y,sysdate,
3,daivik,mysore,y,sysdate,


*********************controlfile*******************

options  ( skip=1 )   -- skip the first logical record (e.g. a header line)
load data
infile 'D:\sql_loader\lod\data.txt'   -- input data file
TRUNCATE   -- empty the target table before each load
into table  customer_ldr
fields terminated by "," optionally enclosed by "#" TRAILING NULLCOLS
(cid ,cname,loc,type, load_date  sysdate)   -- load_date is generated via SYSDATE, not read from the file



C:\Users\Admin>sqlldr scott/tiger control='D:\sql_loader\lod\sysdate_ldr.txt' log='D:\sql_loader\lod\log.txt'

SQL*Loader: Release 11.2.0.1.0 - Production on Sun Oct 7 22:21:50 2018

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

Commit point reached - logical record count 4




SQL> select * from customer_ldr;

       CID CNAME                LOC                  T LOAD_DATE
---------- -------------------- -------------------- - ---------
         1 prabhakar            bidar                y 07-OCT-18
         2 manju                bang                 y 07-OCT-18
         3 daivik               mysore               y 07-OCT-18





Friday, October 5, 2018

Mutating triggers / mutating table errors in triggers

A mutating table error (ORA-04091) is raised when a row-level trigger reads or modifies the very table that its triggering statement is changing. In the example below, a row-level trigger on emp10 inserts back into emp10:

create or replace trigger trig_mut
    before insert on emp10
    for each row
    begin
    insert into emp10(empno,sal) values (:new.empno,:new.sal);
    end;
    /
Trigger created.



SQL> insert into emp10 select * from emp;
insert into emp10 select * from emp
            *
ERROR at line 1:
ORA-04091: table SCOTT.EMP10 is mutating, trigger/function may not see it
ORA-06512: at "SCOTT.TRIG_MUT", line 2
ORA-04088: error during execution of trigger 'SCOTT.TRIG_MUT'
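
A common workaround on Oracle 11g and later is a compound trigger: the row-level section only captures values, and the table is touched again only in the AFTER STATEMENT section, once it is no longer mutating. The sketch below is illustrative only; emp10_audit is a hypothetical target table, since inserting back into emp10 itself is what caused the recursion above:

create or replace trigger trig_mut_fixed
  for insert on emp10
  compound trigger
  type t_sal_tab is table of emp10.sal%type index by pls_integer;
  l_sals t_sal_tab;

  before each row is
  begin
    -- only capture values here; emp10 must not be touched while it is mutating
    l_sals(l_sals.count + 1) := :new.sal;
  end before each row;

  after statement is
  begin
    -- emp10 is stable again, so related tables can be written safely
    for i in 1 .. l_sals.count loop
      insert into emp10_audit (sal) values (l_sals(i));  -- emp10_audit is hypothetical
    end loop;
  end after statement;
end;
/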


Monday, October 1, 2018

Why can't I COMMIT in a database trigger?


We can't COMMIT or ROLLBACK inside a DML trigger: the trigger runs as part of the triggering statement's transaction, and that transaction is committed or rolled back by the session after the DML statement completes. For example, suppose there's a database trigger that tries to commit:

SQL> CREATE TABLE log (timestamp DATE, operation VARCHAR2(2000));

Table created.

SQL>  CREATE TABLE  honnikery (col1 NUMBER);

Table created.

SQL> create or replace trigger honni_trig
  2      after insert on honnikery
  3      begin
  4      insert into log values(sysdate,'insert on tab1');
  5      commit;
  6      end;
  7  /

Trigger created.

SQL> INSERT INTO honnikery VALUES (1);
INSERT INTO honnikery VALUES (1)
            *
ERROR at line 1:
ORA-04092: cannot COMMIT in a trigger
ORA-06512: at "SCOTT.HONNI_TRIG", line 3
ORA-04088: error during execution of trigger 'SCOTT.HONNI_TRIG'


As a workaround, one can use an autonomous transaction. An autonomous transaction executes, and commits, independently of the current transaction:


create or replace trigger honnikery_trig
    after insert on honnikery
declare
  PRAGMA AUTONOMOUS_TRANSACTION;
    begin
    insert into log values(sysdate,'insert on honnikery');
    commit;
    end;
/

Trigger created.

SQL> INSERT INTO honnikery VALUES (1);
1 row created.
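
One pitfall worth noting (an illustrative sequence, not from the original session): because the autonomous transaction commits on its own, the log row survives even when the triggering statement is rolled back:

SQL> INSERT INTO honnikery VALUES (2);
SQL> rollback;
SQL> select count(*) from log;  -- the row logged for the rolled-back insert is still counted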


Friday, August 31, 2018

BULK COLLECT & FORALL vs. CURSOR & FOR-LOOP



CREATE OR REPLACE PROCEDURE CURSOR_FOR_OPEN_ORA
 IS
 l_ename VARCHAR2(30);
 l_empno VARCHAR2(30);
 l_sal VARCHAR2(19);
 CURSOR cur IS SELECT ename, empno , sal  FROM emp;
 BEGIN
 dbms_output.put_line('Before CURSOR OPEN: ' || systimestamp);
 OPEN cur;
 dbms_output.put_line('Before LOOP: ' || systimestamp);
 LOOP
 FETCH cur INTO l_ename, l_empno, l_sal;
 IF cur%NOTFOUND THEN
 EXIT;
 END IF;
 INSERT INTO emp1(ename,empno,sal) values (l_ename, l_empno, l_sal);
 END LOOP;
 CLOSE cur;
 dbms_output.put_line('After CURSOR CLOSE: ' || systimestamp);
 COMMIT;
 END;
/

 Procedure created.

SQL> exec CURSOR_FOR_OPEN_ORA();
Before CURSOR OPEN: 31-AUG-18 03.19.22.033000000 PM +05:30
Before LOOP: 31-AUG-18 03.19.22.033000000 PM +05:30
After CURSOR CLOSE: 31-AUG-18 03.19.22.033000000 PM +05:30

PL/SQL procedure successfully completed.
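
For the comparison the title promises, a minimal BULK COLLECT / FORALL counterpart of the same load might look like this (a sketch assuming the same emp and emp1 tables; it was not part of the original run):

CREATE OR REPLACE PROCEDURE BULK_COLLECT_ORA
 IS
 TYPE t_ename IS TABLE OF emp.ename%TYPE;
 TYPE t_empno IS TABLE OF emp.empno%TYPE;
 TYPE t_sal   IS TABLE OF emp.sal%TYPE;
 l_ename t_ename;
 l_empno t_empno;
 l_sal   t_sal;
 BEGIN
 dbms_output.put_line('Before Bulk Collect: ' || systimestamp);
 SELECT ename, empno, sal
   BULK COLLECT INTO l_ename, l_empno, l_sal
   FROM emp;
 -- one bulk bind instead of one INSERT per row
 FORALL i IN 1 .. l_ename.COUNT
   INSERT INTO emp1(ename,empno,sal) VALUES (l_ename(i), l_empno(i), l_sal(i));
 dbms_output.put_line('After FORALL: ' || systimestamp);
 COMMIT;
 END;
/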

Thursday, August 30, 2018

Informatica Versions History

  • Informatica Powercenter 4.1
  • Informatica Powercenter 5.1
  • Informatica Powercenter 6.1.2
  • Informatica Powercenter 7.1.2
  • Informatica Powercenter 8.1
  • Informatica Powercenter 8.5
  • Informatica Powercenter 8.6
  • Informatica Powercenter 9.1
  • Informatica Powercenter 10

Tuesday, August 28, 2018

How to find the top three highest salaries in the emp table in Oracle?



select empno,salary from emp e
     where 3 > ( Select count(salary) from emp
      where e.salary < salary );

The correlated subquery counts how many employees earn more than the current row; a row with fewer than three higher salaries is in the top three.


Another way:

    select * from
    (
    select empno,salary,
    Rank() over(order by salary desc) as rank from emp )
    where Rank <= 3;

Another way:

    select * from
    (
    select empno,salary from emp
    order by salary desc
    )
    where rownum <= 3;

Note that the RANK() version can return more than three rows when salaries tie, while the ROWNUM version always returns at most three.
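
On Oracle 12c and later, the row-limiting clause offers yet another way (not in the original post; requires 12c or newer):

    select empno,salary from emp
    order by salary desc
    fetch first 3 rows only;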


Saturday, August 25, 2018

Informatica power center

Informatica PowerCenter consists of three main components:

Client tools: installed on developer machines.
PowerCenter repository: the place where an application's metadata is stored.
PowerCenter server: the server that performs the data executions.

With a growing customer base, Informatica continues to extend its data integration solutions. The tool has powerful built-in mapping templates to help manage data efficiently.



There are mainly four steps in the Informatica ETL process:


  1. Extract or Capture
  2. Scrub or Clean
  3. Transform
  4. Load and Index

Features of Informatica ETL: 



For all the Data integration and ETL operations, Informatica has provided us with Informatica PowerCenter. Let us now see some key features of Informatica ETL:
  • Provides facility to specify a large number of transformation rules with a GUI.
  • Generate programs to transform data.
  • Handle multiple data sources.
  • Supports data extraction, cleansing, aggregation, reorganisation, transformation, and load operations.
  • Automatically generates programs for data extraction.
  • High-speed loading of target data warehouses.

Data warehouse design methods



In addition to Inmon's top-down approach to data warehouses and Kimball's bottom-up method, some organizations have also adopted hybrid options.
  • Top-down approach: Inmon's method calls for building the data warehouse first. Data is extracted from operational and possibly third-party external systems and may be validated in a staging area before being integrated into a normalized data model. Data marts are created from the data stored in the data warehouse.


  • Bottom-up method: Kimball's data warehousing architecture calls for dimensional data marts to be created first. Data is extracted from operational systems, moved to a staging area and modeled into a star schema design, with one or more fact tables connected to one or more dimensional tables. The data is then processed and loaded into data marts, each of which focuses on a specific business process. Data marts are integrated using a data warehouse bus architecture to form an enterprise data warehouse.



  • Hybrid method: Hybrid approaches to data warehouse design include aspects from both the top-down and bottom-up methods. Organizations often seek to combine the speed of the bottom-up approach with the integration achieved in a top-down design.

Data warehouses vs. databases vs. data lakes

Databases and data lakes are often confused with data warehouses, but there are important differences.
While data warehouses typically store data from multiple sources and use predefined schemas designed for data analytics, a database generally captures and stores data from a single source, such as a transactional system, and its schema is normalized. Transactional databases aren't designed for analytical queries across very large data sets.

Sunday, August 19, 2018

Types of Data Warehouse

Information processing, analytical processing, and data mining are the three types of data warehouse applications that are discussed below:


  1.  Information Processing: A data warehouse allows the data stored in it to be processed by means of querying, basic statistical analysis, and reporting using crosstabs, tables, charts, or graphs.
  2.  Analytical Processing: A data warehouse supports analytical processing of the information stored in it. The data can be analyzed by means of basic OLAP operations, including slice-and-dice, drill-down, drill-up, and pivoting.
  3.  Data Mining: Data mining supports knowledge discovery by finding hidden patterns and associations, constructing analytical models, and performing classification and prediction. The mining results can be presented using visualization tools.

OLTP vs OLAP: What's the Difference?


What is OLTP?

Online transaction processing, commonly known as OLTP, supports transaction-oriented applications in a 3-tier architecture. OLTP administers the day-to-day transactions of an organization.

The primary objective is data processing, not data analysis.

Examples of OLTP systems:



Online banking
Online airline ticket booking
Sending a text message
Order entry

Benefits of the OLTP method

It administers daily transactions of an organization.
OLTP widens the customer base of an organization by simplifying individual processes.


What is OLAP?



Online Analytical Processing (OLAP) is a category of software tools that provides analysis of data for business decisions. OLAP systems allow users to analyze information from multiple database systems at the same time.

The primary objective is data analysis and not data processing.

Example of OLAP

Any data warehouse system is an OLAP system. Uses of OLAP include:

A company might compare its mobile phone sales in September with sales in October, then compare those results with sales at another location, which may be stored in a separate database.
Amazon analyzes purchases by its customers to build a personalized homepage with products likely to interest the customer.

Benefits of using OLAP services

OLAP creates a single platform for all types of business analytical needs, including planning, budgeting, forecasting, and analysis.
The main benefit of OLAP is the consistency of information and calculations.
Security restrictions can easily be applied to users and objects to comply with regulations and protect sensitive data.



Data Warehouse Concepts, Architecture and Components


What is a Data warehouse?

A data warehouse is an information system that contains historical and cumulative data from single or multiple sources. It simplifies the organization's reporting and analysis processes.

It also serves as a single version of the truth for a company's decision making and forecasting.

In this tutorial, you will learn:

What is Data warehouse?
Characteristics of Data warehouse
  • Subject-Oriented
  • Integrated
  • Time-Variant
  • Non-volatile
Data Warehouse Architectures
Datawarehouse Components
  • Data Warehouse Database
  • Sourcing, Acquisition, Clean-up and Transformation Tools (ETL)
  • Metadata
  • Query Tools
  • Data warehouse Bus Architecture
Data Marts
Data warehouse Architecture Best Practices

Friday, August 17, 2018

List of ETL tools


  1. Oracle Warehouse Builder (OWB)
  2. SAP Data Services
  3. IBM Infosphere Information Server
  4. SAS Data Management
  5. Informatica PowerCenter
  6. Elixir Repertoire for Data ETL
  7. Data Migrator (IBI)
  8. SQL Server Integration Services (SSIS)
  9. Talend Studio for Data Integration
  10. Sagent Data Flow
  11. Actian DataConnect
  12. Open Text Integration Center
  13. Oracle Data Integrator (ODI)
  14. Cognos Data Manager
  15. CloverETL
  16. Centerprise Data Integrator
  17. IBM Infosphere Warehouse Edition
  18. Pentaho Data Integration
  19. Adeptia Integration Server
  20. Syncsort DMX
  21. QlikView Expressor
  22. Relational Junction ETL Manager (Sesame Software)

Monday, August 13, 2018

The Python Topics Go From Beginner To Advanced!

What Will I Learn?
  • Understand the basic as well as core concepts of the Python programming language
  • Acquire enhanced skills with Python graphics

  • Design, create, develop, and code Python games

  • Read and write external data files using Python
  • Implement functions and call built-in Python functions
  • Debug an entire program and handle errors in Python
  • Implement basic Python structures such as if statements, loops, and functions

Learn Data Science

Learn Data Science in R from Scratch

Table of Contents

  1. Basics of R Programming for Data Science
    • Why learn R?
    • How to install R / R Studio?
    • How to install R packages?
    • Basic computations in R
  2. Essentials of R Programming
    • Data Types and Objects in R
    • Control Structures (Functions) in R
    • Useful R Packages
  3. Exploratory Data Analysis in R
    • Basic Graphs
    • Treating Missing values
    • Working with Continuous and Categorical Variables
  4. Data Manipulation in R
    • Feature Engineering
    • Label Encoding / One Hot Encoding
  5. Predictive Modeling using Machine Learning in R
    • Linear Regression
    • Decision Tree
    • Random Forest

Let's get started!
Note: The data set used in this article is from Big Mart Sales Prediction.

1. Basics of R Programming

Why learn R?

I don't know if I have a solid reason to convince you, but let me share what got me started. I had no prior coding experience; in fact, I never studied computer science. I came to know that to learn data science, one must learn either R or Python as a starter. I chose the former. Here are some benefits I found after using R:
  1. The style of coding is quite easy.
  2. It's open source, so there are no subscription charges to pay.
  3. Instant access to over 7,800 packages customized for various computation tasks.
  4. The community support is overwhelming. There are numerous forums to help you out.
  5. High performance computing is available (with the appropriate packages).
  6. It is one of the skills most sought by analytics and data science companies.
There are many more benefits. But these are the ones which have kept me going. If you think they are exciting, stick around and move to the next section. And if you aren't convinced, you may like the Complete Python Tutorial from Scratch.


More: https://www.analyticsvidhya.com/blog/2016/02/complete-tutorial-learn-data-science-scratch/