VUsolutions Transferred to AchiKhasi.com

As of December 2011, this blog (www.VUsolutions.blogspot.com) has moved to http://achikhasi.com/vu/. Visit http://achikhasi.com/vu/ for the latest study-related help.


CS614 Assignment No. 3 solution

Saturday, June 11, 2011
Assignment No. 03
SEMESTER Spring 2011
CS614- Data Warehousing


Question # 1:-
How is the approach of parallelism used in data warehouses and OLAP?

Ans:-
By partitioning the data among a set of processors, OLAP queries can be executed in parallel, potentially achieving linear speedup and thus significantly improving query response time.
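The idea of partitioning data among processors can be illustrated with a minimal sketch. The fact rows, the round-robin partitioning, and the function names below are assumptions made for illustration; threads stand in for the separate processors a real warehouse would use.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fact rows: (year, quantity). Not taken from the assignment.
SALES = [(1998, 10), (1999, 5), (1998, 7), (1999, 3), (1998, 1), (1999, 4)]


def partition(rows, n_parts):
    """Round-robin split of the rows into n_parts roughly equal partitions."""
    return [rows[i::n_parts] for i in range(n_parts)]


def partial_sum(rows):
    """Aggregate one partition: total quantity per year."""
    totals = {}
    for year, qty in rows:
        totals[year] = totals.get(year, 0) + qty
    return totals


def parallel_total(rows, n_workers=3):
    """Run the partial aggregations concurrently, then merge the results."""
    merged = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for part in pool.map(partial_sum, partition(rows, n_workers)):
            for year, qty in part.items():
                merged[year] = merged.get(year, 0) + qty
    return merged


result = parallel_total(SALES)
print(result)
```

Each worker only touches its own partition, so the work per worker shrinks as workers are added; the final merge is the only sequential step.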

In recent years the database community has experienced a tremendous increase in the availability of new technologies to support efficient storage and retrieval of large volumes of data, namely data warehousing and On-Line Analytical Processing (OLAP) products. Efficient query processing is critical in such an environment, yet achieving quick response times with OLAP queries is still largely an open issue. In this paper we propose a solution approach to this problem by applying parallel processing techniques to a warehouse environment. We suggest an efficient partitioning strategy based on the relational representation of a data warehouse (i.e., star schema). Furthermore, we incorporate a particular indexing strategy, Data Indexes, to further improve query processing times and parallel resource utilization, and propose a preliminary parallel star-join strategy.
The case for a parallel approach to the data warehouse is strong because of the inherent nature of such an environment.

Another appeal of parallelism for a data warehouse and OLAP system lies in the logical design of the warehouse. Data warehouses often contain large tables and require techniques both for managing these large tables and for providing good query performance across them. Two key methodologies address these needs: parallelism and partitioning.

Parallelism is certainly an issue if you're using DTS. If you don't specifically set up the package to use parallelism, then your only choice is to run multiple packages at once to get parallel processes running. If you have abstracted the data load in such a way that a single package is used to load multiple files dynamically, then parallelism needs to be considered in order to get multiple threads running within the same package.

There are different degrees of how and where parallelism can and should be used. Ensuring that the query processor is set up correctly to handle parallel processing of the data load is also a consideration. One can also have multiple servers running to get the data loaded across federated servers. This is also considered to be parallelism.

Question #2:-

1) How does the parallel star join discussed in this research article work?

Ans:-
To compute the star join, we first find out the rows of the fact table that will be participating in the final
cube grouping. Since restriction predicates reduce the number of rows in the final result table, we utilise
them first. The following query shows how restriction predicates limit the number of rows in the result
table:

SELECT Year, State, Product_Name, SUM(Quantity) AS "Total Quantity"
FROM Product P, Date D, Sales S, Region R
WHERE P.Product_No = S.Product_No AND D.Date_Key = S.Date_Key AND
R.Region_ID = S.Region_ID AND
D.Year IN (1998, 1999) AND P.Category IN ('Printer', 'Scanner')
GROUP BY CUBE (Year, State, Product_Name);

Assuming the data warehouse (Figure 1) maintains sales records of 10 years, we can straight away notice that only the last two years’ sales rows will be picked from the fact table. The second restriction predicate on product category effects a further reduction.

We use a simple method, namely forming a rowset (of the fact table) for each restriction predicate. The restriction predicates are applied to their corresponding dimensions in parallel. The JVIs that satisfy each predicate are passed on to a coordinator node, where an intersection of the RIDs of all the restricted dimensions is performed. The resulting rowset (set of row IDs) thus formed is the result of the star join. The dimensions for which no restriction predicates are present do not participate in this operation, as they cannot cause any reduction to the resulting rowset. While operating on the fact table we try to distribute as much processing as possible equally to each available node, so that I/O and computation can be performed efficiently in parallel. Algorithm 2 shows how the parallel star join is performed.
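The rowset intersection described above can be sketched as follows. This is an illustrative sketch only, not the article's Algorithm 2: the tiny tables, the column positions, and the function names are hypothetical, and a plain scan of the fact table stands in for the join indexes (JVIs) the article uses.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fact table: row ID -> (date_key, product_no, quantity).
FACT = {
    0: (1, 10, 5),
    1: (2, 11, 3),
    2: (2, 10, 8),
    3: (3, 12, 2),
    4: (1, 11, 6),
}
# Hypothetical dimension tables keyed by their primary key.
DATE = {1: {"Year": 1997}, 2: {"Year": 1998}, 3: {"Year": 1999}}
PRODUCT = {
    10: {"Category": "Printer"},
    11: {"Category": "Monitor"},
    12: {"Category": "Scanner"},
}


def restricted_rowset(dimension, fact_column, predicate):
    """Row IDs of fact rows whose foreign key matches a dimension row
    that satisfies the restriction predicate."""
    keys = {key for key, row in dimension.items() if predicate(row)}
    return {rid for rid, row in FACT.items() if row[fact_column] in keys}


def parallel_star_join(restrictions):
    """Apply each restriction concurrently, then intersect the rowsets
    (the coordinator step)."""
    with ThreadPoolExecutor() as pool:
        rowsets = list(pool.map(lambda r: restricted_rowset(*r), restrictions))
    return set.intersection(*rowsets)


restrictions = [
    (DATE, 0, lambda d: d["Year"] in (1998, 1999)),
    (PRODUCT, 1, lambda p: p["Category"] in ("Printer", "Scanner")),
]
rowset = parallel_star_join(restrictions)
total_quantity = sum(FACT[rid][2] for rid in rowset)
print(sorted(rowset), total_quantity)
```

Dimensions without a restriction predicate are simply left out of `restrictions`, which mirrors the observation that they cannot shrink the final rowset.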

The schema models the activities of a world-wide wholesale supplier over a period of seven years. The fact table is the SALES table, and the dimension tables are the PART, SUPPLIER, CUSTOMER, and TIME tables. The fact table contains foreign keys to each of the dimension tables. This schema suggests an efficient data partitioning as we will soon show.

A common type of query in OLAP systems is the star-join query. In a star-join, one or more dimension tables are joined with the fact table.


Query 1
SELECT U.Name, SUM(S.ExtPrice)
FROM SALES S, TIME T, CUSTOMER C, SUPPLIER U
WHERE T.Year BETWEEN 1996 AND 1998
AND U.Nation='United States' AND C.Nation='United States'
AND S.ShipDate = T.TimeKey AND S.CustKey = C.CustKey
AND S.SuppKey = U.SuppKey
GROUP BY U.Name


A set of attributes that is frequently used in join predicates can be readily identified in the structure of a star schema. In the example star schema, ShipDate, CustKey, SuppKey, and PartKey of the SALES table can be identified as attributes that will often participate in joins with the corresponding dimension tables. We can thus use this information to apply a vertical partitioning method on these attributes to achieve the benefits of parallelism. This paper shows, in fact, that one can use a combination of vertical and horizontal partitioning techniques to extract the parallelism inherent in star schemas. Specifically, we propose a declustering strategy which incorporates both task and data partitioning and present the Parallel Star Join (PSJ) algorithm, which provides a means to perform a star join in parallel using efficient operations involving only rowsets and projection columns.


The Parallel Star Join Algorithm:-
In this section we present our algorithm to perform star joins in parallel.
We represent a general k-dimensional star-join query as follows.

Query 3
SELECT $A^d_P$, $A^m_P$ FROM $F, D_1, \ldots, D_k$ WHERE $P_{\bowtie}$ AND $P_{\sigma}$
Here $D_1, \ldots, D_k$ are the k dimensional tables participating in the join. $P_{\sigma}$ and $P_{\bowtie}$ denote a set of restriction and join predicates respectively. We assume that each individual restriction predicate in $P_{\sigma}$ only concerns one table and is of the form ($a_j$ ⟨op⟩ constant), where $a_j$ is any attribute in the warehouse schema and ⟨op⟩ denotes a comparison operator (e.g., $=, \le, \ge$). We assume each join predicate in $P_{\bowtie}$ is of the form $a_l = a_t$, where $a_t$ is any dimensional key attribute and $a_l$ is the foreign key referenced by $a_t$ in the fact table.

Fin611 Assignment No. 2

Saturday, June 11, 2011
Assignment No. 02 Marks: 15

ABC Company Ltd. has the following Trial Balance as on 31st December, 2010:

Adjustments
• The inventory at the end of the period is valued at Rs.27,500
• General expenses include prepayments to the amount of Rs.1,500
• A provision for Income tax is necessary to the extent of Rs.7,500
• Directors recommended final dividend at 5%
• Depreciation should be written off on all fixed assets at 10% p.a.
• Company has authorized capital of 40,000 shares of Rs.20 each

Required
1. Income Statement for the period ended on 31st December, 2010
2. Statement of Changes in Equity for the period ended on 31st December, 2010
3. Balance Sheet as on 31st December, 2010



Schedule
Opening Date and Time June 09, 2011 At 12:01 A.M. (Mid-Night)
Due Date and Time June 15, 2011 At 11:59 P.M. (Mid-Night)

Fin622 Assignment No. 2 solution

Saturday, June 11, 2011
“Corporate Finance (FIN622) ”
Assignment No. 02 
Marks: 15


Question:
SNT Company has been dealing in the business of books for five years. The company faces steady demand for the books. So, it replenishes the supply by placing an order for more books from the publisher whenever there is inventory shortage. The company is planning to buy 200,000 books over the coming year. Each order that it places costs Rs. 75 and the annual carrying cost of the inventory is Rs. 0.10 per book. The company can place either a single order or multiple orders as provided in the following table. Average inventory over the year would be half of the order size and therefore carrying costs would be calculated accordingly.

(a) Fill in the table, taking the above information into consideration. (10)


Order Size | Orders per year | Average Inventory | Ordering Costs | Carrying Costs | Total Costs
200,000 | | | | |
100,000 | | | | |
50,000 | | | | |
20,000 | | | | |
10,000 | | | | |

(b) Which order should be placed by SNT Company according to the table and why? (2)
(c) Calculate Economic Order Quantity. Is your answer consistent with your findings in part (b)?


Schedule
Opening Date and Time June 09, 2011 At 12:00 A.M. (Mid-Night)
Due Date and Time June 14, 2011 At 11:59 P.M. (Mid-Night)


Solution:


Thanks to Mr. Fahad Yosha for sharing/emailing us this solution.
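The figures asked for in parts (a) to (c) can be checked with a short calculation. This is a minimal sketch rather than the contributed solution; it assumes the standard EOQ formula, sqrt(2 × annual demand × cost per order / carrying cost per unit), which the post itself does not state, and the variable and function names are made up for illustration.

```python
import math

ANNUAL_DEMAND = 200_000   # books per year (from the question)
ORDER_COST = 75.0         # Rs. per order placed (from the question)
CARRYING_COST = 0.10      # Rs. per book per year (from the question)


def cost_row(order_size):
    """Return (orders per year, average inventory, ordering cost,
    carrying cost, total cost) for one order size."""
    orders = ANNUAL_DEMAND / order_size
    avg_inventory = order_size / 2          # half of the order size
    ordering = orders * ORDER_COST
    carrying = avg_inventory * CARRYING_COST
    return orders, avg_inventory, ordering, carrying, ordering + carrying


table = {size: cost_row(size)
         for size in (200_000, 100_000, 50_000, 20_000, 10_000)}
cheapest = min(table, key=lambda size: table[size][4])

# Assumed textbook formula for the Economic Order Quantity.
eoq = math.sqrt(2 * ANNUAL_DEMAND * ORDER_COST / CARRYING_COST)

for size, row in table.items():
    print(size, [round(value, 2) for value in row])
print("cheapest listed order size:", cheapest)
print("EOQ:", round(eoq, 1))
```

Under these assumptions the cheapest listed order size is 20,000 books (total cost Rs. 1,750), and the EOQ comes out at roughly 17,321 books, so the table and the formula point to the same neighbourhood.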


Mth302 Assignment No. 2 solution

Saturday, June 11, 2011
MTH302 (Spring 2011)

Total marks: 40
Lecture # 23 to 29
Due date: 16-06-2011

DON’T MISS THESE IMPORTANT INSTRUCTIONS:

Upload assignment properly through LMS, (No Assignment will be accepted through email).
All students are directed to use the font and style of text as is used in this document.
This is an individual assignment, not a group assignment, so keep in mind that you are supposed to submit your own, self-made and different assignment even if you discuss the questions with your class fellows. All similar assignments (even with some meaningless modifications) will be awarded zero marks and no excuse will be accepted. It is your responsibility to keep your assignment safe from others. Many solution files sent by students in assignment 1 were found to be copied and so were awarded zero. You are therefore reminded of this again.
All questions are graded; solve them properly in a Word file.
Solve the assignment in an MS Word document and upload your Word (.doc) files only. Do not solve the assignment in MS Excel. If we get any assignment in MS Excel or any format other than a Word file, it will not be graded.

Question 1: Marks=10
A shopkeeper wants to know whether there is any correlation between the sale of shampoo and the sale of conditioner. Suppose that he selects a random sample of four brands out of all the brands that he sells. The following information is obtained:
Brands | Sale of Shampoo | Sale of Conditioner
A | 5 | 2
B | 7 | 3
C | 10 | 7
D | 14 | 10


Question 2: Marks=10 
The weekly sale of a product A was recorded as below. Find the coefficient of variation of weekly sale for the Product A.
Product A
60
58
42
25
35
45
15

Question 3: Marks=10 
Find the Relative Frequency, Cumulative Frequency, and Percent Relative Frequency of the data given in the table below.

Class | Frequency ( f )
10 – 15 | 5
15 – 20 | 4
20 – 25 | 6
25 – 30 | 2
30 – 35 | 3
35 – 40 | 7
40 – 45 | 5


Question 4: Marks=10
The given data is 69, 93, 53, 45, 66, 71, 89, 95, 97, 103, 75
Determine the 5-point summary ( i.e. Smallest value, 1st Quartile (Q1), Median (Q2), 3rd Quartile (Q3), Largest value ).


Solution:



Q No. 2 Solution:

See Handouts: Lecture Number 28, Page 204
Data: 60 58 42 25 35 45 15
The coefficient of variation is the standard deviation divided by the mean, multiplied by 100.
Coefficient of variation = (standard deviation / mean) * 100
C.V = (s / mean) * 100
Solve it yourself.

Q No. 3 Solution
See Page # 170 Of handouts
Lecture Number 24
Relative frequency:
Relative frequency of a class = (Frequency of the class interval) / (Total frequency)
Cumulative frequency:
If we add the frequency of the second interval to the frequency of the first interval, the cumulative frequency for the second interval is obtained; each later interval adds its own frequency to this running total.
Percent cumulative relative frequency:
This is calculated in the same way as cumulative frequency, except that the percent relative frequency of each class interval is used. The percent cumulative relative frequency of the last class interval is 100%, as all observations have been added.
Class | Frequency (f) | Relative Frequency | % Relative Frequency
10 – 15 | 5 | 0.16 | 16
15 – 20 | 4 | 0.12 | 12
20 – 25 | 6 | 0.19 | 19
25 – 30 | 2 | 0.06 | 6
30 – 35 | 3 | 0.09 | 9
35 – 40 | 7 | 0.22 | 22
40 – 45 | 5 | 0.16 | 16
Total | 32 | 1 | 100















Q No. 4:

See on Handouts
Lecture 26 Page Number 186
Quartiles divide data into 4 equal parts
Position of each quartile in the ordered data:
1st Quartile Q1 = (n+1)/4
2nd Quartile Q2 = 2(n+1)/4
3rd Quartile Q3 = 3(n+1)/4

Lecture 27 Page Number 194
5-NUMBER SUMMARY:
5 number summary is:
• Smallest value
• 1st Quartile (Q1)
• Median (Q2)
• 3rd Quartile (Q3)
• Largest value

Arrange the data in ascending order
45, 53, 66, 69, 71, 75, 89, 93, 95, 97, 103
Smallest Value = 45
Position of Q1 = (n+1)/4 = (11+1)/4 = 12/4 = 3, so Q1 = 3rd value = 66
Position of Q2 = 2(n+1)/4 = 2(11+1)/4 = 2(3) = 6, so Q2 = 6th value = 75
Position of Q3 = 3(n+1)/4 = 3(11+1)/4 = 3(3) = 9, so Q3 = 9th value = 95
Largest Value = 103
:::::::::::::::::::::::::::::::::::::::



Class | Frequency f | Relative Frequency | % Relative Frequency
10-15 | 5 | 0.16 | 16
15-20 | 4 | 0.12 | 12
20-25 | 6 | 0.19 | 19
25-30 | 2 | 0.06 | 6
30-35 | 3 | 0.09 | 9
35-40 | 7 | 0.22 | 22
40-45 | 5 | 0.16 | 16
Total | 32 | 1 | 100





Answer No. 3

Class | Frequency | Relative frequency | % Relative frequency | Cumulative frequency | % Cumulative frequency
10-15 | 5 | 0.16 | 16 | 5 | 16
15-20 | 4 | 0.12 | 12 | 9 | 28
20-25 | 6 | 0.19 | 19 | 15 | 47
25-30 | 2 | 0.06 | 6 | 17 | 53
30-35 | 3 | 0.09 | 9 | 20 | 62
35-40 | 7 | 0.22 | 22 | 27 | 84
40-45 | 5 | 0.16 | 16 | 32 | 100
Total | 32 | 1 | 100 | |
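The columns of this table can be recomputed with a few lines of code, which is a convenient way to check the running totals. This is a minimal sketch; the variable names are chosen for illustration and the values are left unrounded.

```python
from itertools import accumulate

classes = ["10-15", "15-20", "20-25", "25-30", "30-35", "35-40", "40-45"]
freqs = [5, 4, 6, 2, 3, 7, 5]

total = sum(freqs)                                   # total frequency
relative = [f / total for f in freqs]                # relative frequency
percent = [100 * r for r in relative]                # % relative frequency
cumulative = list(accumulate(freqs))                 # running total of f
percent_cumulative = [100 * c / total for c in cumulative]

for row in zip(classes, freqs, relative, percent, cumulative, percent_cumulative):
    print(row)
```

The cumulative frequencies come out as 5, 9, 15, 17, 20, 27, 32, ending at the total frequency of 32 as they must.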


Answer no. 4

First ordered array

45,53,66,69,71,75,89,93,95,97,103

Smallest value= 45

Largest value= 103

Q1 position = 1(n+1)/4 = (11+1)/4 = 3

Q1 = 3rd value = 66

Q2 position = 2(n+1)/4 = 2(11+1)/4 = 6

Q2 = 6th value = 75

Q3 position = 3(n+1)/4 = 3(11+1)/4 = 9

Q3 = 9th value = 95
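The same steps can be written as a short routine. This is a sketch of the (n+1)/4 position rule used above; the linear interpolation for positions that are not whole numbers is an added assumption (it is not needed for this data set, where the positions are exactly 3, 6 and 9), and the function names are hypothetical.

```python
def quartile(sorted_data, k):
    """k-th quartile using the position rule k(n+1)/4 (1-based position).

    Fractional positions are linearly interpolated between neighbours;
    that interpolation is an assumption, not part of the worked answer.
    """
    n = len(sorted_data)
    pos = k * (n + 1) / 4
    lo = int(pos)                 # whole part of the position
    frac = pos - lo               # fractional part of the position
    lo = min(max(lo, 1), n)       # keep the position inside the data
    hi = min(lo + 1, n)
    low_value = sorted_data[lo - 1]
    high_value = sorted_data[hi - 1]
    return low_value + frac * (high_value - low_value)


def five_number_summary(data):
    """Smallest value, Q1, median (Q2), Q3, largest value."""
    ordered = sorted(data)
    return (ordered[0], quartile(ordered, 1), quartile(ordered, 2),
            quartile(ordered, 3), ordered[-1])


data = [69, 93, 53, 45, 66, 71, 89, 95, 97, 103, 75]
summary = five_number_summary(data)
print(summary)
```

For the given data this reproduces the summary above: 45, 66, 75, 95, 103.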

Answer (2)
Mean = sum(A)/n = 280/7 = 40

To find: S.D and C.V

A | (X-MEAN) | (X-MEAN)^2
60 | 20 | 400
58 | 18 | 324
42 | 2 | 4
25 | -15 | 225
35 | -5 | 25
45 | 5 | 25
15 | -25 | 625
Total = 280 | | Total = 1628

S.D = [sum(X-MEAN)^2 / (n-1)]^(1/2)

S.D = (1628/6)^(1/2)

S.D = 16.47

C.V = (S.D / mean) * 100 = (16.47 / 40) * 100 = 41.18 (approximately)
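The result can be cross-checked with Python's standard `statistics` module, whose `stdev` function uses the same n-1 (sample) divisor as the working above. The variable names are chosen for illustration.

```python
import statistics

sales = [60, 58, 42, 25, 35, 45, 15]   # weekly sale of Product A

mean = statistics.mean(sales)
sd = statistics.stdev(sales)           # sample standard deviation (n - 1)
cv = sd / mean * 100                   # coefficient of variation in percent

print(mean, round(sd, 2), round(cv, 2))
```

This gives a mean of 40, a standard deviation of about 16.47 and a coefficient of variation of about 41.18%.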



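Ans(1)

Y = sale of shampoo, X = sale of conditioner, n = 4

Y | X | XY | X^2 | Y^2
5 | 2 | 10 | 4 | 25
7 | 3 | 21 | 9 | 49
10 | 7 | 70 | 49 | 100
14 | 10 | 140 | 100 | 196
Total = 36 | Total = 22 | Total = 241 | Total = 162 | Total = 370

r = [sum(XY) - sum(X)sum(Y)/n] / sqrt{[sum(X^2) - (sum(X))^2/n] [sum(Y^2) - (sum(Y))^2/n]}

= (241 - 198) / sqrt(41 * 46)

= 43/43.43

= 0.99

Very high positive correlation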
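The correlation coefficient can be verified with a short calculation. This is a minimal sketch of the usual Pearson formula, with function and variable names chosen for illustration.

```python
import math

shampoo = [5, 7, 10, 14]       # Y in the table
conditioner = [2, 3, 7, 10]    # X in the table


def pearson_r(x, y):
    """Pearson correlation coefficient from the raw sums."""
    n = len(x)
    sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    sxx = sum(a * a for a in x) - sum(x) ** 2 / n
    syy = sum(b * b for b in y) - sum(y) ** 2 / n
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(conditioner, shampoo)
print(round(r, 2))
```

The intermediate sums are 43, 41 and 46, so r = 43 / sqrt(41 × 46), which is about 0.99.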
