Multicast - directions for implementation

Background

  • Problem description
    • Large number of "consumers" (users who want to download Mandrake packages);
    • Limited number of mirrors, not necessarily reliable;
    • Constant updates to the package repository (or rather repositories);
    • Up to tens of gigabytes may need to be transferred to some users;
    • Heterogeneity of users' needs, bandwidth, etc.
  • Problems with classic P2P
    • Up to ~10K files (for some users) to search peers for downloading;
    • Not necessarily enough peers for a given file (e.g. esoteric architecture or application);
    • Depends on users' will to share (no incentives);
    • Proves to be not fast enough today (for ISO CD images ? packages can't be downloaded with P2P).
  • Solution ? Multicast over P2P overlay
    • Push instead of pull;
    • Peers can be seen as "n-th degree mirrors";
    • Less bandwidth consuming for the network and the mirrors;
    • Peers do not have to search for other peers to download from;
    • Can be at set times or repeating (or combination);
    • Can be used as a heuristic to "seed" the files at many peers and increase efficiency of later P2P downloading (e.g. for latecomers).

Multicast tree-based approaches

  • Multicasting a file ? naïve approach
    • Create a multicast tree in the overlay network.
    • Source node can be a Mandrake server or an arbitrary peer who gets the file from the main server.
    • Propagate the file through the multicast tree.
  • Naïve approach problems
    • Heterogeneity of peers ? bandwidth suffers and degrades the farther along the tree;
    • One misbehaving node "prunes" its whole sub-tree which doesn't get any more updates;
    • Internal nodes carry whole burden
    • with a fan-out of 6 less than 10% of the nodes are internal nodes ? serving the other 90%;
    • Still depends on users' will to donate bandwidth;
    • Need to deal with lost packets.
  • Solution ? Splitting streams
    • Split a block into multiple pieces, using an error-correcting scheme (e.g. erasure codes). For example: Split into 2n packets, any n of which suffice to rebuild the whole block.
    • Multicast each different piece to a different multicast tree (same peers ? but arranged differently).
    • Each peer can be an internal node in just one tree.
    • Node failure affects only one packet. Packet redundancy increases chance of rebuilding block.
    • State of the art: Split Stream, Coop Net.
  • Splitting streams advantages
    • More fair ? a peer is an internal node only for a small part of the packets;
    • Resilient to some nodes' failure due to redundancy;
    • Can be a framework to implement incentive methods.
  • Splitting streams disadvantages
    • Harder to manage ? creating and maintaining multiple trees;
    • Redundancy wastes some bandwidth.
  • Splitting streams problems
    • Splitting streams is (apparently) the best method when using a multicast tree.
    • But multicast trees have some inherent problems:
    • Each peer's incoming bandwidth depends on its parents alone. Keeping the incoming pipe "full" is not up to the peer.
    • Dependent on upper levels in the tree. Splitting streams helps but doesn't solve the core problem.
    • Sensitive to high churn ? need to repair the tree.
    • The solution: mesh-based approach.

Multicast mesh-based approaches

  • Mesh-based approaches
    • The idea: don't constraint peers in trees. Allow peers to connect randomly to other peers.
    • Each peer should keep its incoming pipe full and should connect to other peers for that purpose.
    • An underlying control infrastructure may be built, e.g. a tree for passing control messages.
    • A mesh, as opposed to a tree, derives a need for a mechanism to locate useful peers.
    • The control infrastructure can be used for that.
  • Mesh-based considerations
    • Push?
      • Push methods are simpler for the receiver ? no need to manage senders and requests.
      • Senders might send duplicate packets.
      • Receivers have to inform senders about what they have.
    • Pull?
      • The receiver has to manage its senders and allocate packets to send.
      • For this, it needs to discover what packets peers have.
      • More control of a peer on its incoming pipe.
    • Peering strategy
      • Centralized ? peers contact a server to get a list of peers downloading/sharing the same file. Not scalable.
      • Fixed set of peers ? not resilient to changes in the network.
      • Using DHT/DOLR ? might not perform well for looking up multiple peers.
      • We need a distributed and dynamic way to get updated list of peers.
    • Request strategy - Assuming pull
      • In which order to request the missing blocks? We want to increase the block variety.
      • From which peer to request each block?
      • How much data should be requested from each sender?
    • Request strategy - Assuming push
      • How to synchronize with other senders, so as to not send the same block multiple times?
    • Some more considerations
      • Sending strategy ? assuming a sender knows which blocks to send, in what order should it send them?
      • Should the system be oriented towards fairness or towards speed?
  • State of the art
    • Avalanche
      • Similar to BitTorrent ? uses a centralized tracker for getting updated peer lists.
      • Uses "tit-for-tat" incentives and shows that it makes the system slower (fairness/speed trade-off).
      • Uses network coding so that peers will be able to help each other with high probability.
    • Slurpie
      • Also uses a centralized tracker, but it only points to the last peers who've requested to download the file.
      • Nodes discover new nodes by passing update messages between each other and can add new useful neighbors depending on the bandwidth.
    • Chainsaw
      • Peers are connected randomly in a mesh.
      • When a peer gets a new block, it notifies its neighbors so they could request it.
      • Blocks are serialized in sliding "windows".
    • Bullet'
      • Hybrid approach ? source pushes the updates, the others pull.
      • Separation between sending and receiving peers.
      • Peers discover other useful peers by running RanSub on an underlying control tree.
      • Peers added and replaced according to their usefulness and bandwidth.
      • Request strategy is "Rarest Random".
      • Several more strategies are employed to make Bullet' results compare favorably to other systems, including BitTorrent.
  • Building the infrastructure
    • Problem description
      • Not all peers participate in the multicast.
      • Peers will be grouped into clusters, each forming its own control structure and mesh.
      • Even inside a given cluster, not all peers will necessarily be interested in the content.
      • How do we form those structures efficiently and without involving uninterested peers in forwarding multicast content?
    • Subscriber-Volunteer Trees
      • The S/V Trees approach tries to remedy the "non-interested forwarder" problem of some tree-based multicast systems.
      • Peers form an S/V tree on top of the RPF (Reverse Path Forwarding) overlay tree.
      • All nodes still have to forward control messages, but are exempt from forwarding multicast content.
    • Usefulness of S/V trees
      • Use S/V trees wherever overlay trees might be used in our system (e.g. event notification).
      • If we use a mesh-based multicast approach, then the overlay tree could gain from being part of an S/V tree.
      • Also, perhaps we could create a "forest" of trees in a given cluster, so that nodes will not have to forward any content they're not interested in.

More possibilities

  • Handling multiple files
    • Users usually need more than one package or file. Usually, the packages will be related.
    • A package might be needed as part of more than one "bundle".
    • But we'd like to send it only once.
    • Need a method to balance the following trade-off:
      • Minimize the number of separate multicast trees a peer is part of.
      • Minimize the number of packages a peer receives but is not interested in.
    • If we use clustering, then this problem lessens. Also, as written previously, we could check out using S/V trees.
  • Open issues
    • Building the multicast tree(s)
      • Centralized or decentralized?
      • Minimize bandwidth for peers who don't participate in the multicast
      • Relation to peer incentives?
    • Solving the trade-off of multiple package trees
    • Security

Some desired properties and considerations

  • ~1M users
  • Max file size, avg file size: ?
  • "Ad-hoc" tree (build & throw away) - how do the interested peers build the tree?
  • High churn rate ? dial-up users etc. - perhaps we can "ignore" it (make it impossible to join the tree after multicast has started, for example).
  • Want to reduce load on peers, perhaps OK on main server.
  • Want to reduce (pref. to 0) handling unwanted multicast traffic.
  • Acceptable to miss a notification periodically (?)
  • Can utilize today's mirrors as "trusted" peers / additional source nodes?
  • Some of the files may exist beforehand on the network and will not (?) be multicast again.
  • Peers who got the file should register with the persistent storage system.

State of the Art

Bayeux

Scribe

Split Stream

Bullet/Bullet'

Coop Net

Chainsaw

MULTI+

Bullet' and Chainsaw seem to be the most mature system and show the best results (in simulation). Both have approaches involving pulling the information (pull only in Chainsaw, pull+push in Bullet'). We need to build a system based on either one of these two (or perhaps combining both), taking into account our own scenario and use cases, most notably - building the tree only with interested peers and multicasting several files efficiently. The chosen system should also be able to co-operate with the system that will be chosen for the persistent storage.

See Reliable multicast on P2P overlays: Scribe and SplitStream by Anne Marie Kermarrec

Version 1.3 last modified by StephaneLauriere on 27/06/2005 at 00:51

Comments 0

No comments for this document

Attachments 0

No attachments for this document

Creator: Yotam on 2005/05/22 19:49
Copyright EDOS Consortium
1.1.1