Friday, March 15, 2019

Distributed Queries & Replication


Building large, effective databases calls for distributed computing, because engineers can assemble such systems at relatively low cost without any specialized technology. Several cost models exist for distributed query optimization (Hameurlain et al., 2008). One of these, created by Lanzelotte, Valduriez and Zait, is the dynamic program cost model, which captures all the elements of parallelism and scheduling (Aljanaby et al., 2004). In this model the query execution plan contains only join nodes; moreover, the data sets of a node are partitioned among the nodes of different homes (Aljanaby et al., 2004). The nodes of different homes do not have partitions. The researchers define the cost of a plan in terms of three elements: total work (TW), response time (RT) and memory consumption (MC). Total work and response time capture the trade-off between response time and throughput (Aljanaby et al., 2004). Memory consumption shows how much memory the execution of the plan requires. Because the cost model relies on dynamic parameters, the decisions related to optimization cost have to be made at run time. This requires creating several execution plans, which are put together by choosing operators (Aljanaby et al., 2004).
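
The three cost elements can be illustrated with a minimal sketch. The composition rules below are simplifying assumptions made for illustration, not the formulas published by Lanzelotte, Valduriez and Zait: work is summed, response time follows the slower of two inputs that run in parallel, and memory is summed because both inputs are assumed to be resident at once. The names `PlanCost` and `join_parallel_inputs` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanCost:
    total_work: float     # TW: work summed over every operator
    response_time: float  # RT: elapsed time along the slowest path
    memory: float         # MC: memory needed to run the plan


def join_parallel_inputs(left: PlanCost, right: PlanCost,
                         join_work: float, join_memory: float) -> PlanCost:
    """Combine two sub-plans that run in parallel and feed one join.

    Assumed rules (illustrative only): work adds up, elapsed time follows
    the slower input, and both inputs plus the join are in memory together.
    """
    return PlanCost(
        total_work=left.total_work + right.total_work + join_work,
        response_time=max(left.response_time, right.response_time) + join_work,
        memory=left.memory + right.memory + join_memory,
    )


# Two hypothetical sub-plans on different nodes, then a join on top.
left = PlanCost(total_work=4.0, response_time=4.0, memory=100.0)
right = PlanCost(total_work=6.0, response_time=6.0, memory=200.0)
plan = join_parallel_inputs(left, right, join_work=2.0, join_memory=50.0)
print(plan)
```

Under these assumptions the plan does 12 units of total work but finishes in 8 units of response time, which is the kind of trade-off between throughput and response time that the TW and RT elements are meant to expose.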

The second model is the distributed cost model. Distributed query optimization produces a plan for processing a query in a distributed system. The cost model predicts the cost of a particular execution plan, which consists of the secondary storage cost, memory storage cost, computation cost and communication cost (Taniar et al., 2008). The researchers assume that system memory is limited, which affects the dominant processes that make up the execution time of a plan (Taniar et al., 2008). In general, the cost of the entire plan can be calculated by totaling the costs of the individual operators, which are visited in a post-order traversal of the execution plan (Taniar et al., 2008).
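
Totaling operator costs in post-order can be sketched as follows. This is an illustrative example, not code from Taniar et al.; the class names, the plan and all cost figures are made up, and the four cost components are simply added with equal weight.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class OperatorCost:
    storage: float = 0.0        # secondary storage (disk) cost
    memory: float = 0.0         # memory storage cost
    computation: float = 0.0    # CPU cost
    communication: float = 0.0  # network transfer cost

    def total(self) -> float:
        return self.storage + self.memory + self.computation + self.communication


@dataclass
class PlanNode:
    name: str
    cost: OperatorCost
    children: List["PlanNode"] = field(default_factory=list)


def post_order(node: PlanNode) -> Iterator[PlanNode]:
    """Yield every child subtree first, then the node itself."""
    for child in node.children:
        yield from post_order(child)
    yield node


def plan_cost(root: PlanNode) -> float:
    """Total plan cost: the sum of each operator's own cost."""
    return sum(node.cost.total() for node in post_order(root))


# Hypothetical plan: two scans feeding one join.
scan_r = PlanNode("scan R", OperatorCost(storage=10.0))
scan_s = PlanNode("scan S", OperatorCost(storage=20.0))
join = PlanNode("join", OperatorCost(computation=5.0, communication=15.0),
                children=[scan_r, scan_s])

visit_order = [node.name for node in post_order(join)]
total = plan_cost(join)
```

The traversal visits both scans before the join, so each operator is costed only after the operators that produce its inputs.
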

Factors that impact the performance of query execution strategies
The role of query processing is to determine where queries should execute so that communication costs and query response time are minimized. Several factors influence the performance of query execution strategies (Raipurkar & Bamnote, 2013). The primary factor is the ability to exploit parallelism between clients and servers. The interactions between the client and the server determine the cost and the response time of a query execution strategy (Raipurkar & Bamnote, 2013). The client-server relationship also matters for correctness: the execution strategy has to be correct with respect to the user's transaction. The relationship between the server and the client further affects the choice of the execution strategy that optimizes execution performance (McDermid, 1991). The features of the client-server environment therefore affect the quality of plans by directly affecting their cost and response time.

Another factor that affects the execution of the query strategy is the dynamics of the plan. The structure of the plan defines the specific setting in which the plan is implemented (McDermid, 1991), and it determines the optimization level of the query strategy. The structure of the plan should ensure that the query strategy achieves the highest level of performance at the most appropriate cost. Data transmission and local data processing must be arranged so that they yield a minimal response time and a minimal total time for a particular class of queries (Raipurkar & Bamnote, 2013).
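
As a small illustration of choosing an execution point that minimizes communication cost, the sketch below picks the site to which the least data has to be shipped. The linear cost formula (a fixed startup cost plus a per-kilobyte cost), the function names and the relation sizes are assumptions for this example, not values from the cited sources.

```python
from typing import Dict, Tuple


def transfer_cost(size_kb: int, startup: int = 5, per_kb: int = 1) -> int:
    """Assumed cost of shipping one relation: fixed startup plus per-KB cost."""
    return startup + per_kb * size_kb


def choose_execution_site(sizes_kb: Dict[str, int]) -> Tuple[str, int]:
    """Return (site, cost) for the site that needs the least data shipped to it.

    sizes_kb maps each site to the size of the relation stored there.
    Running the query at a site means shipping every other relation to it.
    """
    best_site, best_cost = None, None
    for site in sizes_kb:
        cost = sum(transfer_cost(size)
                   for other, size in sizes_kb.items() if other != site)
        if best_cost is None or cost < best_cost:
            best_site, best_cost = site, cost
    return best_site, best_cost


# Hypothetical relations: 1000 KB at site_a, 200 KB at site_b.
choice = choose_execution_site({"site_a": 1000, "site_b": 200})
```

Here the query runs at `site_a`, because shipping the 200 KB relation costs 205 units while shipping the 1000 KB relation would cost 1005. A real optimizer would weigh local processing cost and response time as well, which this sketch leaves out.
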

Comparison between the replication cycle of TimesTen and the 2PC site termination protocol
There are some similarities between the replication cycle of TimesTen and the 2PC site termination protocol. One is that the replicated data stays consistent through updates in both (Özsu & Valduriez, 2011). When engineers update the data in the databases, they ensure that the backup data matches the replica. The coordinator refers to it in case there is a problem with the replica (Özsu & Valduriez, 2011). The data is consistent between the master and the subscriber databases. Another similarity is that the database engineers link the replicas so that an update or change in the original database results in the same change in the replica that users view. TimesTen's active standby pair arrangement applies a single, distinct strategy that keeps the standby site fully current with the active site (Özsu & Valduriez, 2011).

A further similarity between TimesTen replication and the 2PC site termination protocol is the diversity of the communication paradigms (Reimann et al., 2011). Both include protocols that allow the participants to communicate with one another and others that restrict communication between parties. Although the communication paradigms differ in name, the structures of their protocols are similar. The protocol optimization strategies also bear some similarities (Reimann et al., 2011). The termination protocols for the replication cycle of TimesTen resemble the site failure termination protocol for 2PC. In both cases, configuring the replication scheme to direct the master replication agent to commit all transactions that time out is optional (Reimann et al., 2011).
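
For reference, the timeout rules of the 2PC termination protocol, as commonly presented in distributed database textbooks, can be sketched as a pair of decision functions. This is a simplified illustration: the state names and returned action labels are chosen for this example, and nothing here models TimesTen or its replication agent.

```python
from enum import Enum


class State(Enum):
    INITIAL = "initial"
    WAIT = "wait"      # coordinator has sent "prepare" and awaits votes
    READY = "ready"    # participant has voted to commit and awaits the decision
    COMMIT = "commit"
    ABORT = "abort"


def coordinator_on_timeout(state: State) -> str:
    """Action the coordinator takes when a participant fails to answer."""
    if state is State.WAIT:
        # Not every vote arrived, so the transaction cannot commit.
        return "global-abort"
    if state in (State.COMMIT, State.ABORT):
        # The decision is already made; repeat it until acknowledged.
        return "resend-decision"
    raise ValueError(f"coordinator does not time out in state {state.name}")


def participant_on_timeout(state: State) -> str:
    """Action a participant takes when the coordinator fails to answer."""
    if state is State.INITIAL:
        # It has not voted yet, so it may abort on its own.
        return "unilateral-abort"
    if state is State.READY:
        # It voted to commit and cannot decide alone; it stays blocked
        # until it learns the global decision.
        return "blocked-await-decision"
    raise ValueError(f"participant does not time out in state {state.name}")
```

The READY case is the one that makes basic 2PC a blocking protocol: a participant that has voted to commit must wait for the coordinator or another site to tell it the outcome.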


References
Aljanaby, A., Abuelrub, E., & Odeh, M. (2004). A Survey of Distributed Query Optimization. The International Arab Journal of Information Technology, Vol. 2, Issue 1, pp. 48-57.
Hameurlain, A., Morvan, F., & Samad, M. (2008). Large scale data management in grid systems: a survey. In Information and Communication Technologies: From Theory to Applications, 2008. ITTA 2008. 3rd International Conference (pp. 1-6). IEEE.
McDermid, J. (1991). Software engineer's reference book. Oxford: Butterworth-Heinemann.
Özsu, M., & Valduriez, P. (2011). Principles of distributed database systems. New York: Springer Science+Business Media.
Raipurkar, A., & Bamnote, G. (2013). Query Processing In Distributed Database through Data Distribution. International Journal of Advanced Research in Computer and Communication Engineering, Vol. 2, Issue 2, pp. 1134-1139.
Reimann, P., Reiter, M., Schwarz, H., Karastoyanova, D., & Leymann, F. (2011). SIMPL-A Framework for Accessing External Data in Simulation Workflows. In BTW (Vol. 11, pp. 534-553).
Taniar, D., Leung, C. H., Rahayu, W., & Goel, S. (2008). High-performance parallel database processing and grid databases (Vol. 67). John Wiley & Sons.


Carolyn Morgan, a senior editor at MeldaResearch.Com, is the author of this paper.
