Tuesday, December 27, 2005
Sunday, December 25, 2005
configuration parameter
1. the splitting strategy has two options: size-based and number-based
2. the placement of views (indexes) has two options: hop-based and latency-based
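A minimal sketch of how these two parameters could be encoded so that every combination gets simulated. The names `SplitStrategy`, `ViewPlacement`, `SimConfig`, and `all_configs` are hypothetical and are not taken from the simulator.

```python
from dataclasses import dataclass
from enum import Enum


class SplitStrategy(Enum):
    # The two splitting options listed in the note.
    SIZE_BASED = "size"
    NUMBER_BASED = "number"


class ViewPlacement(Enum):
    # The two placement options for views (indexes) listed in the note.
    HOP_BASED = "hop"
    LATENCY_BASED = "latency"


@dataclass(frozen=True)
class SimConfig:
    # One simulation configuration: a splitting choice plus a placement choice.
    split: SplitStrategy
    placement: ViewPlacement


def all_configs():
    # Enumerate the full 2 x 2 grid of configurations.
    return [SimConfig(s, p) for s in SplitStrategy for p in ViewPlacement]


if __name__ == "__main__":
    for cfg in all_configs():
        print(cfg.split.value, cfg.placement.value)
```

Keeping both parameters as enums means a simulation run can be labeled by its configuration and the four combinations compared side by side.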
Thursday, December 22, 2005
adding a last step to take into account the retrieval of complete tuples
Add a last step that retrieves the complete tuples by using the Chord protocol
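A rough sketch of what this last step might look like, assuming the index lookup has already produced the keys of the matching tuples. It models only the Chord rule that a key is stored on its successor node; the finger-table routing of the real protocol is left out, and `Ring`, `chord_id`, `fetch_complete_tuples`, and the 16-bit identifier space are all hypothetical choices for illustration.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits; an assumed small value for illustration


def chord_id(key: str) -> int:
    # Hash a name or key onto the identifier circle [0, 2^M).
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


class Ring:
    def __init__(self, node_names):
        # Nodes sorted by their position on the identifier circle.
        self.nodes = sorted((chord_id(n), n) for n in node_names)
        self.ids = [ident for ident, _ in self.nodes]
        self.store = {n: {} for n in node_names}

    def successor(self, ident: int) -> str:
        # First node whose id is >= ident, wrapping around the circle.
        pos = bisect_left(self.ids, ident)
        return self.nodes[pos % len(self.nodes)][1]

    def put(self, key: str, full_tuple):
        self.store[self.successor(chord_id(key))][key] = full_tuple

    def get(self, key: str):
        return self.store[self.successor(chord_id(key))].get(key)


def fetch_complete_tuples(ring: Ring, matching_keys):
    # Last step: one lookup per key returned by the index.
    return [ring.get(k) for k in matching_keys]


if __name__ == "__main__":
    ring = Ring(["peer-%d" % i for i in range(8)])
    ring.put("t1", ("t1", "alice", 30))
    ring.put("t2", ("t2", "bob", 25))
    print(fetch_complete_tuples(ring, ["t2", "t1"]))
```

In a simulation, each `get` would also be charged the routing cost of a Chord lookup, which this sketch does not model.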
Friday, December 16, 2005
reading "system simulation"
1. When the outcome of an activity can be described completely in terms of its input, the activity is said to be deterministic; if the effects of the activity vary randomly over various possible outcomes, the activity is said to be stochastic.
2. Systems such as the aircraft, in which the changes are predominantly smooth, are called continuous systems, while systems like the factory, in which changes are predominantly discontinuous, are called discrete systems. There are also systems that are intrinsically continuous but for which information is only available at discrete points in time; these are called sampled-data systems.
3. One way of obtaining independent results is to repeat the simulation. Repeating the experiment with different random numbers for the same sample size n gives a set of independent determinations of the sample mean.
4. Besides batch means, replication of runs, and regenerative techniques, time series analysis and spectral analysis are two other techniques to capture the autocorrelation characteristics among discrete events.
5. To remove the initial bias, two general approaches can be taken: the system can be started in a more representative state than the empty state, or the first part of the simulation run can be ignored.
6. Another approach to the problem of establishing confidence intervals for simulation results does not rely upon replication, but uses a single long run, preferably with the initial bias removed.
7. Frequently, the service time of a process is constant; but, where it varies stochastically, it must be described by a probability function. If the service time is considered to be completely random, it may be represented by an exponential distribution. Another common situation is that, although the service time should be constant, there are random fluctuations due to uncontrolled factors; the normal, or Gaussian, distribution is often used to represent the service time under these circumstances.
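The notes on replication, initial bias, and exponential service times can be tied together in a small sketch. It simulates a single-server FIFO queue with exponential interarrival and service times, ignores the first part of each run, and repeats the run with different seeds to get independent sample means. The rates, run length, warm-up length, and function names are assumptions chosen for illustration and are not taken from the book or from my simulator.

```python
import random
import statistics

T_975_DF9 = 2.262  # Student t quantile for a 95% interval with 9 degrees of freedom


def mean_wait_one_run(seed, n_customers=5000, warmup=500,
                      arrival_rate=0.8, service_rate=1.0):
    # One replication: mean waiting time after discarding the warm-up part.
    rng = random.Random(seed)      # a different seed gives an independent run
    wait = 0.0                     # the system starts in the empty state
    total = 0.0
    for i in range(n_customers):
        if i >= warmup:            # ignore the first part of the run
            total += wait
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        # Lindley recursion: waiting time of the next customer.
        wait = max(0.0, wait + service - interarrival)
    return total / (n_customers - warmup)


def replicate(n_runs=10):
    # Repeat the experiment with different random numbers, same sample size.
    means = [mean_wait_one_run(seed) for seed in range(n_runs)]
    grand_mean = statistics.mean(means)
    std_err = statistics.stdev(means) / n_runs ** 0.5
    return grand_mean, std_err, means


if __name__ == "__main__":
    grand_mean, std_err, _ = replicate(n_runs=10)
    half_width = T_975_DF9 * std_err   # valid only for 10 replications
    print("mean wait %.3f +/- %.3f" % (grand_mean, half_width))
```

Because each replication uses its own seed, the per-run means are independent, so the usual t-based interval applies to them even though the waiting times inside a single run are autocorrelated.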
the simulation should be redesigned
1. should fix the premise: either we deploy tables intentionally, using a uniform distribution or a locality-preserving strategy, or the data are already deployed
2. the query load is too small; should check the queueing mechanism of the simulator and see how it can induce resource contention
3. regarding the locality-preserving strategy, should first test based on resource contention (query load), then storage contention (storage load)
Thursday, December 08, 2005
some facts
response time = waiting time + service time
Service time is the time that it takes the hardware to service an I/O request.
Waiting time may be needed if the hardware is busy.
Latency: in networking, the amount of time it takes a packet to travel from source to destination. Together, latency and bandwidth define the speed and capacity of a network.
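A small worked example of the relation response time = waiting time + service time, assuming a single server that handles requests in arrival order. The function name and the sample numbers are made up for illustration.

```python
def fifo_response_times(arrivals, service_times):
    # Return (waiting, service, response) per request for one FIFO server.
    server_free_at = 0.0
    rows = []
    for arrival, service in zip(arrivals, service_times):
        start = max(arrival, server_free_at)   # wait only if the server is busy
        waiting = start - arrival
        response = waiting + service           # response = waiting + service
        server_free_at = start + service
        rows.append((waiting, service, response))
    return rows


if __name__ == "__main__":
    # Requests arrive every 1 time unit but each needs 3 units of service,
    # so the waiting time grows with every request.
    for row in fifo_response_times([0.0, 1.0, 2.0], [3.0, 3.0, 3.0]):
        print(row)
```

With these numbers the three requests wait 0, 2, and 4 time units and see response times of 3, 5, and 7, which shows that contention only appears once arrivals outpace the service rate.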
Wednesday, December 07, 2005
each peer issues one query over the same table
otherwise, the data structure gets corrupted by different queries
schedule
1. prepare multiple tables
2. run more queries
3. run over multiple tables individually and combine the results
4. deploy multiple tables and analyze the results
5. realize the locality-preserving one
6. compare three strategies
7. draft a two-page description of the index-based strategy
