Tuesday, December 27, 2005

resource capacity per peer

use parallelism to characterize the resource capacity of each peer

Sunday, December 25, 2005

configuration parameter

1. the splitting strategy has two options: size-based and number-based
2. the placement of views (indexes) has two options: hop-based and latency-based
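
A minimal sketch of how these two parameters could be encoded for the simulator, so that every combination can be enumerated. The names SplitStrategy, PlacementStrategy, SimConfig and all_configs are hypothetical, not taken from any existing code.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class SplitStrategy(Enum):
    SIZE_BASED = "size-based"
    NUMBER_BASED = "number-based"


class PlacementStrategy(Enum):
    HOP_BASED = "hop-based"
    LATENCY_BASED = "latency-based"


@dataclass(frozen=True)
class SimConfig:
    """One combination of the two configuration parameters (hypothetical)."""
    split: SplitStrategy
    placement: PlacementStrategy


def all_configs():
    """Enumerate every split/placement combination, 2 x 2 = 4 in total."""
    return [SimConfig(s, p) for s, p in product(SplitStrategy, PlacementStrategy)]


if __name__ == "__main__":
    for cfg in all_configs():
        print(cfg.split.value, "+", cfg.placement.value)
```

Enumerating the combinations up front makes it easy to run the same query load once per configuration.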

Thursday, December 22, 2005

adding a last step to take into account the retrieval of complete tuples

Add a last step that retrieves the complete tuples using the Chord protocol.
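
A rough sketch of what this last step could look like, assuming each complete tuple is stored at the Chord successor of its hashed key. It uses a global sorted list of node identifiers instead of finger-table routing, so it only shows the key-to-node mapping, not the O(log N) lookup. The names chord_id, successor and fetch_complete_tuples, the 16-bit ring size and the storage dictionary are all assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits, assumed small here for readability


def chord_id(key, m=M):
    """Hash a string key onto the identifier ring [0, 2^m)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(node_ids, key_id):
    """Return the first node id >= key_id on the ring, wrapping around.

    node_ids must be sorted in ascending order.
    """
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]


def fetch_complete_tuples(tuple_keys, node_ids, storage):
    """Look up each key's owner and fetch the complete tuple from it.

    storage maps node id -> {tuple key -> complete tuple} (hypothetical layout).
    """
    result = []
    for key in tuple_keys:
        owner = successor(node_ids, chord_id(key))
        result.append(storage[owner][key])
    return result
```

In the simulator, each successor call would be replaced by a routed Chord lookup, so that its hops and latency are charged to the query.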

Friday, December 16, 2005

reading "system simulation"

1. when the outcome of an activity can be described completely in terms of its input, the activity is said to be deterministic, while if the effects of the activity vary randomly over various possible outcomes, the activity is said to be stochastic.

2. systems such as the aircraft, in which the changes are predominantly smooth, are called continuous systems, while systems like the factory, in which changes are predominantly discontinuous, are called discrete systems. There are also systems that are intrinsically continuous but for which information is only available at discrete points in time; these are called sampled-data systems.

3. One way of obtaining independent results is to repeat the simulation. Repeating the experiment with different random numbers for the same sample size n gives a set of independent determinations of the sample mean.

4. Besides batch means, replication of runs, and regenerative techniques, time series analysis and spectral analysis are two other techniques to capture the autocorrelation characteristics among discrete events.

5. Two general approaches can be taken to remove the initial bias: the system can be started in a more representative state than the empty state, or the first part of the simulation run can be ignored.

6. Another approach to the problem of establishing confidence intervals for simulation results does not rely upon replication, but uses a single long run, preferably with the initial bias removed.

7. Frequently, the service time of a process is constant; where it varies stochastically, it must be described by a probability function. If the service time is considered to be completely random, it may be represented by an exponential distribution. Another common situation is that, although the service time should be a constant, there are random fluctuations due to uncontrolled factors; the normal, or Gaussian, distribution is often used to represent the service time under these circumstances.
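
A small sketch that combines two of the points above: replicated runs with different random numbers, and ignoring the first part of each run to reduce the initial bias. It simulates a single-server FIFO queue with exponential interarrival and service times (the "completely random" case). The function names, the default rates, the warm-up length and the normal-approximation factor 1.96 are all assumptions chosen for illustration; with few replications a Student t quantile would be more appropriate.

```python
import random
import statistics


def mean_wait_one_run(seed, n_customers=2000, warmup=200,
                      arrival_rate=0.8, service_rate=1.0):
    """One replication: mean waiting time, ignoring the first `warmup` customers."""
    rng = random.Random(seed)      # separate random stream per replication
    clock = 0.0                    # arrival time of the current customer
    server_free_at = 0.0           # time at which the server becomes idle
    waits = []
    for i in range(n_customers):
        clock += rng.expovariate(arrival_rate)
        start = max(clock, server_free_at)
        if i >= warmup:            # drop the start-up part of the run
            waits.append(start - clock)
        server_free_at = start + rng.expovariate(service_rate)
    return statistics.mean(waits)


def replicate(n_runs=10, **kwargs):
    """Repeat the run with different seeds; return (mean, CI half-width)."""
    means = [mean_wait_one_run(seed, **kwargs) for seed in range(n_runs)]
    centre = statistics.mean(means)
    # 1.96 is the normal approximation for a 95% interval (assumption)
    half_width = 1.96 * statistics.stdev(means) / (n_runs ** 0.5)
    return centre, half_width


if __name__ == "__main__":
    centre, half_width = replicate()
    print("mean wait = %.3f +/- %.3f" % (centre, half_width))
```

Because each replication uses its own seed, the per-run means are independent, which is what makes the simple confidence interval valid.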

the simulation should be redesigned

1. should fix the premise: either we deploy tables intentionally, using a uniform distribution or a locality-preserving strategy, or the data are already deployed
2. the query load is too small; should check the queueing mechanism of the simulator and see how it can invoke resource contention
3. regarding the locality-preserving strategy, should first test based on resource contention (query load), then storage contention (storage load)

Thursday, December 08, 2005

some facts

response time = waiting time + service time

Service time is the time that it takes the hardware to service an I/O request.
Waiting time may be needed if the hardware is busy.
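
A tiny worked example of the identity above, assuming a single FIFO server and requests listed in arrival order. The function name response_times and the sample numbers are made up for illustration.

```python
def response_times(arrivals, service_times):
    """Return (waiting, service, response) per request for one FIFO server.

    arrivals must be in ascending order; all times share the same unit.
    """
    free_at = 0.0                      # time at which the hardware becomes idle
    rows = []
    for arrive, service in zip(arrivals, service_times):
        start = max(arrive, free_at)   # wait only if the hardware is busy
        waiting = start - arrive
        rows.append((waiting, service, waiting + service))
        free_at = start + service
    return rows


if __name__ == "__main__":
    # three requests arriving 1 time unit apart, each needing 3 units of service
    for row in response_times([0.0, 1.0, 2.0], [3.0, 3.0, 3.0]):
        print(row)
```

With these numbers the waiting times are 0, 2 and 4, so the response times are 3, 5 and 7: the service time stays constant while the response time grows with the queue.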

Latency: in networking, the amount of time it takes a packet to travel from source to destination. Together, latency and bandwidth define the speed and capacity of a network.
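
A first-order sketch of how the two quantities combine, assuming a single link and ignoring queueing and protocol overhead: transfer time is roughly latency plus message size divided by bandwidth. The function name and the sample values are made up.

```python
def transfer_time(size_bytes, latency_s, bandwidth_bps):
    """Approximate time in seconds to deliver a message over one link.

    latency_s: one-way delay in seconds; bandwidth_bps: bits per second.
    """
    return latency_s + (size_bytes * 8) / bandwidth_bps


if __name__ == "__main__":
    # 1 MB over a link with 50 ms latency and 8 Mbit/s bandwidth
    print(transfer_time(1_000_000, 0.05, 8_000_000))
```

For small messages the latency term dominates; for large ones the bandwidth term does.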

PIER computeLocationID

in PIER, the location ID combines the namespace hash string and the resource ID
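
One plausible way to picture this combination, as a sketch only: hash the namespace to a string, join it with the resource ID, and hash the result onto the identifier space. This is not PIER's actual computeLocationID; the function name location_id, the ":" separator, the use of SHA-1 and the 160-bit space are all assumptions.

```python
import hashlib


def location_id(namespace, resource_id, bits=160):
    """Hypothetical location ID built from a namespace hash string and a resource ID."""
    # hash string of the namespace
    ns_hash = hashlib.sha1(namespace.encode("utf-8")).hexdigest()
    # combine it with the resource ID, then map onto the identifier space
    combined = (ns_hash + ":" + resource_id).encode("utf-8")
    value = int.from_bytes(hashlib.sha1(combined).digest(), "big")
    return value % (1 << bits)


if __name__ == "__main__":
    print(hex(location_id("tableR", "tuple-42")))
```

Under this scheme the same resource ID in two different namespaces maps to two different locations.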

Wednesday, December 07, 2005

each peer issues one query over the same table

otherwise, the data structure will be corrupted by different queries

schedule

1. prepare multiple tables
2. run more queries
3. run over multiple tables individually and combine the results
4. deploy multiple tables and analyze the results
5. realize the locality-preserving one
6. compare three strategies
7. draft a two-page description of the index-based strategy

Friday, December 02, 2005

question: why does naive tree-multicast have smaller latency?

need to use a timing tool to check; gprof may work