Investigating the Viability of Peer-to-Peer Applications using BitTorrent Sync

Kieran Richard Hunt

Abstract

With recent revelations surrounding the confidentiality of data entrusted to online file synchronization services there has been a push towards developing similar systems that do not rely on external servers which the user does not have control over. One such service is BitTorrent Sync. BitTorrent Sync provides the ability for users to synchronize folders across multiple devices. The BitTorrent Sync developers also provide an API for accessing the application programmatically. The goal of this research is to demonstrate that functional applications can be developed using a distributed system.
Two applications were developed for the purpose of this research. One application mimics the functionality of Twitter whilst the other mimics that of a Facebook group. These applications have been developed using BitTorrent Sync as a way of distributing the content between users.
This research shows that it is viable to replicate most of the basic functionality of existing, traditionally client-server, applications using a Peer-to-Peer approach.
Software and its engineering → Peer-to-peer architectures
Computer systems organization → Client-server architectures
I would firstly like to thank my supervisor, Professor Philip Machanick, for his mentorship and counsel throughout this research. His advice has proved invaluable in guiding my thoughts and progress throughout this venture.
I am indebted to my family for both their emotional and financial support throughout the year. Their teachings and encouragement have shaped me into the person I am today.
I would also like to thank Professor Barry Irwin for his support and guidance this year.
This work was undertaken in the Distributed Multimedia CoE at Rhodes University, with financial support from Telkom SA, Tellabs, Genband, Easttel, Bright Ideas 39, THRIP and NRF SA (TP13070820716). The authors acknowledge that opinions, findings and conclusions or recommendations expressed here are those of the author(s) and that none of the above mentioned sponsors accept liability whatsoever in this regard.

1 Introduction

By 2014, much of the Internet had converged on the same point. Users of the Internet interact with companies and each other using a client-server model. The users, called clients, request services from providers who run servers. These end points usually communicate over the Internet. Recent revelations regarding privacy when using these services [A]  [A] http://www.theguardian.com/world/edward-snowden have led some users to seek more secure alternatives to commonly used services on the Internet. One such service that is now very widely used [7] is Dropbox.
Dropbox allows its users to synchronize folders across multiple computers and keeps an online backup which is accessible via any web browser. Dropbox has made clients available on many platforms so that users are always able to synchronize their files. It also has a massive user base, with 50 million registered users. Recently, the appointment of Condoleezza Rice [B]  [B] http://www.drop-dropbox.com/ to its Board of Directors and the company’s approach to government requests for data [C]  [C] https://www.dropbox.com/transparency/principles have left people looking for alternatives.
BitTorrent, the company which maintains and develops the widely used BitTorrent protocol, has developed a system to synchronize files between computers using a purely peer-to-peer approach. Their product, BitTorrent Sync, rivals tools like Dropbox, iCloud and Google Drive without offering online backup. BitTorrent has opened up an application programming interface (API) which allows developers to create programs on top of their system as a way of developing networked and truly distributed applications.
Having a platform for distributed applications without the massive server overhead has opened up an unexplored realm. In this thesis we will demonstrate the re-development of a few existing and popular web applications in order to assess their viability as well as how well they scale. This project aims to show that this approach to distributed systems can be viable and adoptable.

1.1 Traditional Concepts

1.1.1 Twitter

figure Screenshots/bitTorrentSync-tweet.png
Figure 1.1 A Sample Tweet. @BitTorrentSync
Twitter is the first of the traditionally client-server applications that this project aims to replicate. Its function is similar to that of RSS [D]  [D] http://cyber.law.harvard.edu/rss/rss.html or Atom [E]  [E] http://www.ietf.org/rfc/rfc4287.txt feeds. A user will subscribe to receive updates from other users. Whenever a user makes a posting (or tweet) all other users can potentially receive that tweet and it is added onto a list of other tweets, usually in chronological order.
Much of what Twitter offers its users translates very easily to a peer-to-peer (P2P) model and that is why it has been chosen as the initial testbed. Tweeting, direct messaging, retweeting, replying and favouriting are at the heart of the Twitter experience and they are all replicable without massive central servers. A simple tweet (as exemplified in Figure 1.1↑) via Twitter would usually be entered by the user. This would then be sent to a large server bank somewhere, where it would be added to a database. When another user who is subscribed to their tweets logs on, the tweet would be read from the database and displayed alongside the other tweets on their feed. In a P2P situation, the tweet would be added to a BitTorrent Sync folder. The people who are subscribed to receive updates to that folder would then commence downloading the tweet from the original user and the other users in the swarm who have pieces to share.
A retweet is the process of replicating a tweet that someone else has made so that the people who are subscribed to you will see it as well. This is performed on Twitter by clicking on the retweet button. In Figure 1.1↑, the retweet button is represented by the two right-angled arrows in a circle. With P2P, the retweet would be exactly the same as tweeting except that a copy of the tweet would be transferred from the folder that contains your synchronized tweets with the original user to the folder that contains all of your tweets. Some kind of reformatting would have to take place so that the tweet looks as if it has originated from someone other than you and that you were the one who retweeted it.
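The tweet and retweet flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: each tweet is stored as one JSON file, the folder names and the fields `author` and `retweeted_by` are hypothetical, and BitTorrent Sync is assumed to be synchronizing the folders in the background (no BitTorrent Sync API call is made here).

```python
import json
import tempfile
from pathlib import Path


def post_tweet(folder, author, text, tweet_id):
    """Store a tweet as one JSON file in the author's synchronized folder."""
    if len(text) > 140:
        raise ValueError("a tweet is limited to 140 characters")
    tweet = {"id": tweet_id, "author": author, "text": text, "retweeted_by": None}
    path = Path(folder) / f"{tweet_id}.json"
    path.write_text(json.dumps(tweet))
    return path


def retweet(source_file, own_folder, retweeter):
    """Copy a followed user's tweet into our own folder, marking who retweeted it."""
    tweet = json.loads(Path(source_file).read_text())
    tweet["retweeted_by"] = retweeter  # the original author field is left untouched
    path = Path(own_folder) / f"rt-{tweet['id']}.json"
    path.write_text(json.dumps(tweet))
    return path


def demo():
    """Post a tweet in a followed user's folder, then retweet it into our own."""
    with tempfile.TemporaryDirectory() as root:
        followed = Path(root, "following", "BitTorrentSync")  # hypothetical layout
        mine = Path(root, "my-tweets")                        # hypothetical layout
        followed.mkdir(parents=True)
        mine.mkdir()
        original = post_tweet(followed, "BitTorrentSync", "Hello, swarm!", "0001")
        copy = retweet(original, mine, "kieran")
        return json.loads(copy.read_text())
```

Keeping the original `author` field and adding `retweeted_by` is one way to perform the reformatting mentioned above, so that the tweet still appears to originate from someone else.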
Favouriting a message in Twitter is usually just to show that the user appreciates the message. This is done on Twitter via the favourite button. In Figure 1.1↑ this is represented by a star icon with the number one to its right. This tells other users that this message has been favourited once. Favouriting has very little impact on other users and so it doesn’t really fit into the P2P realm: apart from sending a notification back to the original user that one of their Tweets has been favourited, no other user is involved. If the message itself carried information about how many times that tweet had been favourited then it would leave the number open to tampering. A user with insincere motives could simply edit the count to misrepresent the number of times the tweet has been favourited.
Replying to a Tweet is also very tricky. Replies appear as normal Tweets from a user, sometimes with the original Tweet included but other times without, usually due to the 140 character limitation. Twitter keeps track of replies and will often show them under a Tweet [F]  [F] For an example see: https://twitter.com/BitTorrentSync/status/514504734531940352. Tracking which tweets have been replied to and then collating them with their original Tweets is something that is much better suited to a large central database, as you would need the ability to tell other people who follow someone that a tweet has been replied to.
Finally, direct messaging allows users who have opted into following each other’s updates to share messages with each other privately. In a P2P environment such as BitTorrent Sync this can be achieved by simply adding a separate folder on which both users have read/write privileges. The two users will be the only users with access to that folder and, as such, the messages will remain private. A drawback to this is that it requires both users to be online simultaneously so that the data may be exchanged. A potential solution to this problem would be to have the data synchronized with other users but remain encrypted on their drives. This does require some level of cooperation and ultimately selflessness. There may also be legal implications if the users are helping to share illegal content, even without their knowledge.

1.1.2 Facebook

figure Screenshots/facebook-group.png
Figure 1.2 A Sample Facebook group post from the page RHODES SRC
A Facebook group offers a few other features that aren’t present in Twitter. Facebook offers communities in the form of groups and pages. Figure 1.2↑ shows a sample post on the Rhodes Student Representative Council’s Facebook group. This post was made by a person who has joined (opted into receiving updates from) the group and is then visible to all users who have also subscribed to that group. Groups offer a huge amount of manageability to their administrators and moderators. This kind of manageability is an inherent advantage of the centrally managed systems over P2P. Implementing a rudimentary group using BitTorrent Sync simply involves allowing multiple users to have read/write privileges on a shared folder that they are all set to synchronize with. Adding other features further complicates things.
figure Diagrams/thesis_group.png
Figure 1.3 A proposed method for creating P2P group styles.
Figure 1.3↑ describes a proposed method for creating Facebook-style groups where an administrator is responsible for allowing messages to be posted to the group. Once the administrator has allowed the post it is then shared in the same manner as before via BitTorrent Sync and the BitTorrent protocol. The problem with this system is that it now requires the administrator to be online whenever someone makes a post in order to allow or bar said post.
In Figure 1.3↑ the Administrator is represented by the large rectangle above the users (the smaller squares). The main way to see and synchronize posts is via the area marked Group Contents. If a user wishes to make a post, it will be placed in the folder denoted by the smaller rectangle inside their user space in Figure 1.3↑. This folder is only synchronized with the administrator and is denoted by the smaller squares within the administrator space. If the administrator decides that this post is fit to be shared with everyone then they will move it out of the folder corresponding to the original user and into a special folder for all posts. Posts in this folder will then synchronize with all other users. This system could possibly be improved by adding more administrators. Once again, as described in subsection 1.1.1↑, other users could help to alleviate the need for administrators to be constantly online by keeping encrypted mirrors of the new posts on their drives until the administrators approve or deny them.
figure Diagrams/thesis_group_multi_admin.png
Figure 1.4 An improved group administrator system (similar to Figure 1.3↑)
Adding more administrators means that there is no longer a single point of failure. A bottleneck could also arise if the single administrator were to receive many posts at once and so having more than one person to do the job would help.
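The administrator's approval step can be sketched as a plain file move between synchronized folders. This is a minimal sketch, not the thesis implementation: the folder names `pending/<user>` and `group-contents`, the `.txt` post format and the `approve` callback are all hypothetical, and BitTorrent Sync is assumed to handle the synchronization of each folder.

```python
import shutil
import tempfile
from pathlib import Path


def moderate(pending_root, group_folder, approve):
    """Move approved posts into the group folder; delete the rest."""
    published = []
    # One sub-folder per user, each synchronized only with the administrator.
    for post in sorted(Path(pending_root).glob("*/*.txt")):
        if approve(post.read_text()):
            # Prefix the file name with the user's folder name to avoid clashes.
            target = Path(group_folder) / f"{post.parent.name}-{post.name}"
            shutil.move(str(post), str(target))
            published.append(target.name)
        else:
            post.unlink()
    return published


def demo():
    """Two users submit a post each; the administrator rejects one of them."""
    with tempfile.TemporaryDirectory() as root:
        pending = Path(root, "pending")
        group = Path(root, "group-contents")
        group.mkdir()
        for user, text in [("alice", "Meeting at 5pm"), ("bob", "BUY CHEAP PILLS")]:
            Path(pending, user).mkdir(parents=True)
            Path(pending, user, "post1.txt").write_text(text)
        published = moderate(pending, group, lambda text: "PILLS" not in text)
        remaining = sorted(p.name for p in pending.glob("*/*.txt"))
        return published, remaining
```

With several administrators, each could run the same step against the same pending folders, which is where the bottleneck and single point of failure discussed above would be relieved.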

1.1.3 Overall Scalability

One of the greatest contemporary challenges for any system designer is that of creating a system which scales dynamically with demand. With the advent of services like Amazon’s Elastic Compute Cloud (EC2) [G]  [G] http://aws.amazon.com/ec2/ systems have the infrastructure to grow to meet their users’ demand. Designers also need to create systems which are able to scale within that infrastructure, for example when more computing power is added to a datacentre. The BitTorrent protocol has an inherent scalability (See Figure and Subsection 2.6.1.1↓) which means that applications built on top of it do too. This thesis aims to show how well the Twitter- and Facebook-like applications scale.

1.2 Peer-to-Peer related problems

An inherent problem with P2P systems is that they are generally very difficult to manage. Because there is no central server to perform overarching inspection, each client must check whether the data that it receives from other clients is what it expects. Traditionally this has been done by creating a digest (usually a 20-byte SHA1 hash) of each piece of the file being transferred [5] [H]  [H] Specifically see: http://www.BitTorrent.org/beps/bep_0003.html#info-dictionary. Downloaded pieces are then hashed and those hashes are compared with the originals.
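The piece check described above can be illustrated with Python's standard `hashlib` module. This is a minimal sketch: the function names and the tiny piece length are chosen for illustration only, and a real client would read the expected digests from the metainfo file rather than compute them locally.

```python
import hashlib


def split_pieces(data, piece_length):
    """Cut the data into fixed-size pieces; the last piece may be shorter."""
    return [data[i:i + piece_length] for i in range(0, len(data), piece_length)]


def piece_digests(data, piece_length):
    """Digests published by the sharer: one 20-byte SHA1 hash per piece."""
    return [hashlib.sha1(piece).digest() for piece in split_pieces(data, piece_length)]


def verify_piece(index, piece, expected_digests):
    """Hash a downloaded piece and compare it with the published digest."""
    return hashlib.sha1(piece).digest() == expected_digests[index]


# Illustrative data with a piece length of 4 bytes (real torrents use far larger pieces).
digests = piece_digests(b"hello world", 4)
good = verify_piece(0, b"hell", digests)      # an untampered piece
tampered = verify_piece(0, b"hElL", digests)  # a piece altered by a malicious peer
```

A piece that fails the comparison is simply discarded and requested again from another peer.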

1.3 Summary of Challenges

The P2P approach is relatively unexplored, which makes this project even more of a challenge. There hasn’t been much research on the subject nor has there been corporate interest because of the difficulty in monetizing P2P systems. Furthermore, an issue with P2P systems is public adoption. Users need a reason to leave their current systems and adopt the one being promoted. As can be seen from current statistics [27], existing social networks are very integrated into users’ everyday lives and privacy does not seem to be a very large concern.
Another challenge is developing systems that mimic existing systems with a completely different approach to their functioning. This needs to be done with the least possible inconvenience to the user.
Further challenges include using the BitTorrent Sync API whilst it is in its infancy.

2 Literature Review

2.1 Introduction

In this document I will go through the literature surrounding BitTorrent Sync. Released in 2013 [9], BitTorrent Sync has had very limited literature written about it owing to how young the technology is. As such I will cover as much as I can find directly related to BitTorrent Sync and then look deeper into the BitTorrent protocol, as well as other protocols that achieve similar results, such as rsync.
Furthermore, I will look into other factors that concern my project such as the scalability of a system based on the BitTorrent protocol as well as security concerns. Since cloud-based systems, such as Dropbox and Google Drive, have drawbacks such as security and privacy concerns [9] and charging for using more than the allocated space quota, it is useful to study alternatives such as BitTorrent Sync.

2.2 Introductory Concepts

2.2.1 Networked vs Distributed Systems

“A distributed system is a collection of independent computers that appears to its users as a single coherent system.”
[22]
A distributed system is more than just what is offered by a networked system. Unlike a networked system where every machine needs to be independently managed, a distributed system is built on top of many networked machines to form a layer of abstraction; these machines are autonomous. To the user this layer appears as a single system (Figure 2.1) and for all intents and purposes functions as such. The distributed system should also help to abstract from the hardware to the point where the location of the hardware is almost irrelevant.
figure Diagrams/thesis_distributed.png
Figure 2.1 A distributed system. (Tanenbaum 2002)
An example of a distributed system might be that of a server room within a company. Employees of that company might have their own personal computers on which they work but if that system needs to execute a command it will find the best place to do so. The best place could be their computer, the computer of a colleague or in the server room. To the user the entire system acts like a single entity. It is a distributed system.
Distributed systems can have the advantage of scaling very well (See Section 2.8↓) compared to that of purely networked systems. A distributed system can be designed with choke points either unintentionally or in order to help manage the system. This does mean that there is a tradeoff of degradation of service or increased cost.

2.2.1.1 Naming

Naming in a distributed system is not specific to location as it is with a networked system. In a distributed system you can simply refer to a service by its name without specifying a location. The location of the service is not a naming issue so much as it is an issue of performance.

2.2.2 Single-Source vs P2P

Single-source is a misleading term. It is often used to refer to a client-server relationship in which users are independent clients which request data from a server. The server itself may be built on top of a distributed system or be a single machine. The client requests a service from the server. The server processes this request and then provides the service. The server then responds to the client.
If the same request is shared by many clients then the clients are referred to as distributed. These distributed clients can still interact with the system providing the service; however, they may also work independently of any central service. This type of system is known as P2P.

2.2.3 Other services

Some distributed systems offer services that are independent of the client-server model. A service such as Akamai [I]  [I] http://www.akamai.com/ offers service providers the chance to improve the responsiveness of their service by moving their service physically closer to their customers.
Users who wish to access the website of a service (that has been correctly configured to use Akamai) will request the webpage via a web browser. This webpage request will be processed by Akamai, which will determine which of their caching servers is best to send the request to. The caching server will be chosen for its proximity to the user as well as its current load in an effort to decrease the time it takes for the user to receive the reply. This can be seen as a distributed system that is not directly run by the provider of the service being requested.
The locations of each of the Akamai caching servers can be seen in Figure 2.2↓ [J]  [J] http://www.akamai.com/html/about/locations.html.
figure Screenshots/thesis-akamai-locations.png
Figure 2.2 Locations of Akamai servers throughout the world.

2.3 BitTorrent

BitTorrent is a peer-to-peer (P2P) based protocol used for transferring files between computers without those files needing to travel through or be stored by a central server. The protocol was originally designed by Bram Cohen in 2001 [4]. According to his specification [5], Cohen designed the protocol because he was unhappy with the way that HTTP handled file downloads. With HTTP, a single file is transferred from the server to the client. Cohen designed BitTorrent to support multiple users downloading the same file and sharing that file with each other at the same time. Because of this, the protocol removes much of the load from the server and distributes it among the various peers.
For a BitTorrent transaction to occur the following is needed: a Web server to hold the metafile, a BitTorrent tracker, a user who originally has the file and wants to share it with others (known as the original downloader), a Web browser (or some other method of obtaining the meta data file) and finally some kind of download client such as rTorrent [K]  [K] http://libtorrent.rakshasa.no/, µTorrent [L]  [L] http://www.utorrent.com/, Vuze [M]  [M] http://www.vuze.com/, Transmission [N]  [N] http://www.transmissionbt.com/ or one of many others.

2.3.1 Metainfo Files

Metainfo files are the way that a server gives the client all of the information about the file that it is going to download. The metainfo file is a dictionary with the following keys: announce and info. The announce key holds the URL of the tracker(s) to be used and the info key holds a further dictionary. Very little of the metafile is human readable, but Listing 2.1↓ shows part of the ubuntu-12.04.4-desktop-amd64.iso.torrent metafile that is used as one of the ways to distribute the Ubuntu operating system [O]  [O] http://www.ubuntu.com/.
d8:announce39:http://torrent.ubuntu.com:6969/announce13:announce-listll39:http://torrent.ubuntu.com:6969/announceel44:http://ipv6.torrent.ubuntu.com:6969/announceee7:comment29:Ubuntu CD releases.ubuntu.com13:creation datei1391706822e4:infod6:lengthi768606208e4:name32:ubuntu-12.04.4-desktop-amd64.iso12:piece lengthi524288e6:pieces29320:...
Listing 2.1 Metainfo file contents
As can be seen, there are multiple announce URLs that point to different trackers. Trackers are the only single point of failure in a BitTorrent system and so having several of them makes room for redundancy. There is then a place for a comment and the creation date of the metainfo file. What follows forms part of the information section. There are fields for the length of the file and the length of the individual pieces, as well as the pieces field itself, which holds a 20-byte SHA1 hash for every piece and therefore also gives the number of pieces. There is also the name field. This recommends a name for the file for when it is saved on the disk.
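The metainfo file is bencoded, and a small decoder makes its structure easier to see. The decoder below is a minimal sketch with no error handling, and the `SAMPLE` input is a shortened, hand-made variant of the Ubuntu metafile in Listing 2.1: the announce-list, comment, creation date and pieces fields are left out so that the example stays complete and valid.

```python
def bdecode(data, pos=0):
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":                      # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                      # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                      # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    colon = data.index(b":", pos)         # byte string: <length>:<bytes>
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length


# Shortened sample (assumption: several fields omitted, see the lead-in).
SAMPLE = (b"d8:announce39:http://torrent.ubuntu.com:6969/announce"
          b"4:infod6:lengthi768606208e"
          b"4:name32:ubuntu-12.04.4-desktop-amd64.iso"
          b"12:piece lengthi524288eee")

meta, _ = bdecode(SAMPLE)
info = meta[b"info"]
# Ceiling division: the last piece may be shorter than the piece length.
num_pieces = -(-info[b"length"] // info[b"piece length"])
```

For the Ubuntu image this gives 1466 pieces, which agrees with the 29320-byte pieces field in the listing (1466 hashes of 20 bytes each).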

2.3.2 Trackers

Trackers are the backbone of the BitTorrent operation. They coordinate and keep track of all of the nodes. A client may interact with a tracker using standard HTTP. They respond to GET requests [5].
The GET requests carry many keys: some allow the downloader to announce itself to the tracker, whilst others let the tracker know how close the downloader is to completing the download.
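An announce request of this kind can be assembled with the standard library. The key names below (info_hash, peer_id, port, uploaded, downloaded, left, event) are those listed in the BitTorrent specification [5]; the peer ID, the port and the hashed byte string are made-up values for illustration, and no request is actually sent.

```python
import hashlib
from urllib.parse import urlencode


def announce_url(tracker, info_hash, peer_id, port,
                 uploaded, downloaded, left, event=None):
    """Build the GET URL a downloader uses to announce itself to a tracker."""
    params = {
        "info_hash": info_hash,    # 20-byte SHA1 of the bencoded info dictionary
        "peer_id": peer_id,        # 20-byte identifier chosen by the client
        "port": port,              # port the client is listening on
        "uploaded": uploaded,      # bytes uploaded so far
        "downloaded": downloaded,  # bytes downloaded so far
        "left": left,              # bytes still needed to complete the download
    }
    if event is not None:
        params["event"] = event    # e.g. "started", "completed" or "stopped"
    return tracker + "?" + urlencode(params)


# Hypothetical values: a real client hashes the actual info dictionary.
example_hash = hashlib.sha1(b"example info dictionary").digest()
url = announce_url("http://torrent.ubuntu.com:6969/announce",
                   example_hash, b"-XX0001-123456789012", 6881,
                   uploaded=0, downloaded=0, left=768606208, event="started")
```

The `left` value is what tells the tracker how far the downloader is from completion: it starts at the full file length and reaches zero when the download finishes.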

2.3.3 Peer Protocol

Peers communicate with each other through a BitTorrent transaction via the peer protocol [5]. Once a peer completes a piece of a file, it will announce that completion to its peers so that they may download that piece from that client if they need it.

2.3.3.1 Choking

Choking is the method by which the BitTorrent protocol introduces efficiency and fairness into the system. A BitTorrent client unchokes four peers at a time and chooses which peers to unchoke according to their download rate [6]. The download rate is measured by the client over a 20-second rolling window.
Part of the problem with choking is that a peer may be ‘snubbed’. This occurs when it is choked by all of its peers. BitTorrent determines that a peer is being snubbed when it does not complete a piece in over a minute [6].
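The selection rule can be sketched as follows. This is a simplified illustration rather than the algorithm of any particular client: the class and function names are hypothetical, the peer names and byte counts are invented, and refinements such as optimistic unchoking and snub handling are left out.

```python
from collections import deque

WINDOW_SECONDS = 20  # length of the rolling window
UNCHOKE_SLOTS = 4    # number of peers unchoked at a time


class PeerStats:
    """Bytes received from one peer, kept as (timestamp, byte count) samples."""

    def __init__(self):
        self.samples = deque()

    def record(self, timestamp, num_bytes):
        self.samples.append((timestamp, num_bytes))

    def rate(self, now):
        """Average download rate in bytes per second over the rolling window."""
        while self.samples and self.samples[0][0] < now - WINDOW_SECONDS:
            self.samples.popleft()  # drop samples that fell out of the window
        return sum(n for _, n in self.samples) / WINDOW_SECONDS


def choose_unchoked(peers, now):
    """Return the names of the peers with the best recent download rates."""
    ranked = sorted(peers, key=lambda name: peers[name].rate(now), reverse=True)
    return ranked[:UNCHOKE_SLOTS]


def demo_peers():
    """Six invented peers; peer 'f' sent a lot of data, but long ago."""
    peers = {name: PeerStats() for name in "abcdef"}
    for name, num_bytes in zip("abcde", (5000, 4000, 3000, 2000, 1000)):
        peers[name].record(95, num_bytes)
    peers["f"].record(50, 99999)  # outside the 20-second window at now=100
    return peers
```

Because only the last 20 seconds count, a peer that was generous in the past but has stopped sending data quickly loses its unchoke slot.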

2.3.3.2 Completion

Peers who have completed their download can no longer determine which peers to upload to based on their download speed. They thus decide to upload to peers who themselves have the best upload rate. This ensures that the maximum upload capacity is used [6].

2.3.4 Incentives in Peer-to-Peer Networks

[15] describe how they developed a BitTorrent client (BitThief) to effectively free ride a download. This refers to downloading files whilst uploading absolutely nothing in return. One of the methods that they described was announcing that the client has a piece available but, when asked for that piece, uploading randomised data instead of the actual piece that the downloader is expecting. This worked at first but soon the peers would block them as they checked the files against a hash. BitThief is available to the public [P]  [P] http://www.bitthief.ethz.ch/. This type of client has the potential to ruin an incentive-based file sharing system as it eliminates a good reason for users to upload (maintaining a good download speed).
Incentivising the cooperation of peers in P2P networks is one of the core challenges for anyone developing a P2P system. Sharing communities are groups that have emerged with the goal of ensuring high quality downloads as well as enforcing fair downloads. These sharing communities enforce strict upload/download ratios to make sure that their members share as much as possible [1]. Traditionally clients are unlikely to want to share a file once they have completed their download as it costs them resources such as bandwidth and processing time. A Prisoner’s Dilemma occurs [12]. In their paper, [12] construct the payoff matrix shown below.
                               Server
                        Allow Download    Ignore Request
Client   Request File        7/-1              0/0
         Don’t Request       0/0               0/0
The payoff matrix describing a typical P2P transaction. [12]
As can be seen in the payoff matrix above, the best result for the client occurs when both the server and the client cooperate. This is the worst result for the server.
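The dilemma can be made explicit by encoding the payoff matrix and asking which reply is best for the server. This is a small sketch: the move names are labels chosen here, and each cell is read as (client payoff, server payoff), which is an assumption about the order of the two numbers in the matrix.

```python
# (client move, server move) -> (client payoff, server payoff)
PAYOFFS = {
    ("request", "allow"): (7, -1),
    ("request", "ignore"): (0, 0),
    ("dont_request", "allow"): (0, 0),
    ("dont_request", "ignore"): (0, 0),
}


def best_server_reply(client_move):
    """The server move with the highest server payoff for a given client move."""
    return max(("allow", "ignore"), key=lambda s: PAYOFFS[(client_move, s)][1])


def best_cell_for_client():
    """The cell of the matrix in which the client's payoff is highest."""
    return max(PAYOFFS, key=lambda cell: PAYOFFS[cell][0])
```

Under that reading, the client does best when its request is allowed, yet a purely self-interested server prefers to ignore the request, which is why some incentive to upload is needed.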

2.4 BitTorrent Sync

As noted in Section 2.1↑, very little has been published on BitTorrent Sync itself. A recent paper by [9] covers a basic introduction to the technology but focuses on digital forensic investigations, which is covered under Security (Section 2.9↓).
BitTorrent Sync can be seen as an alternative to many commercial, cloud-based remote backup and mirroring services (often referred to as cloud synchronization services) such as Google Drive [Q]  [Q] https://drive.google.com/, Microsoft OneDrive [R]  [R] https://onedrive.live.com/ and Dropbox [S]  [S] https://www.dropbox.com/. It differs in a few major ways. User data is not instantly available online and it will not be backed up in a data centre, but the size of the synced files is limited only by the user’s hard drive space.
BitTorrent Sync has some advantages over the ‘traditional’ cloud synchronization services. According to the BitTorrent Sync User manual [2, 9], BitTorrent Sync supports clients in many different environments (including desktop and mobile); there is no charge for bandwidth or limitation on the size of files; backup happens automatically, without user input, in a manner similar to commercial products; and end-to-end data encryption is offered via the RSA encryption standard, as well as an option to set remote encryption if the user wants to store their data in a location that they may not trust.
Upon installation on a Microsoft Windows-based or Mac OS machine, or unpacking on a Unix-based machine, BitTorrent Sync creates three files in the folder that has been chosen for synchronisation. In a test that was run, the /home/kieran/downloads directory was chosen as a sync directory. Listing 2.2↓ shows the contents of a directory set to be synchronized.
kieran@g11h3779-1:~/downloads$ ls -la
total 3560
drwxr-xr-x  4 kieran kieran    4096 May 23 09:59 .
drwxr-xr-x 32 kieran kieran    4096 May 22 16:08 ..
drwxr-xr-x  3 kieran kieran    4096 May 23 09:56 btsync_x64
-rw-r--r--  1 kieran kieran 3615069 May 23 09:56 btsync_x64.tar.gz
drwxr-xr-x  2 kieran kieran    4096 May 23 09:59 .SyncArchive
-rw-r--r--  1 kieran kieran      20 May 23 09:59 .SyncID
-rw-r--r--  1 kieran kieran     296 May 23 09:59 .SyncIgnore
Listing 2.2 The sample contents of a folder set to be shared with BitTorrent Sync.
Note that this is the same folder that held the original tarball and as such contains the extracted directory too (btsync_x64). The three generated entries are .SyncArchive, .SyncID and .SyncIgnore [9].

2.4.1 Share Keys

Whenever a new BitTorrent Sync share is made, new keys are created. One of the keys is the master key, or full read/write key [9]. Anyone with this key (usually the owner) has the ability to both read and change the contents of the shared directory. Another is a read-only key, whilst the third is a key that allows for synchronization but leaves all data encrypted. These keys are stored in the sync.dat file that is populated once the new share has been created. The BitTorrent Sync API Documentation [T]  [T] http://www.BitTorrent.com/sync/developers/api states that further secrets can be obtained using the get_secrets method, with the option to supply the master key. It will return three more keys in JavaScript Object Notation (JSON) format.
As an example, a shared folder was created with the following read/write key:
A7BLTFEQ2M36EXHXEWRGW6KUVC73T2PSS
With,
http://[address]:[port]/api?method=get_secrets[&secret=(secret)&type=encryption]
the following was returned:
{ "read_only": "BKVWKOTNX3PTKDXKOKOU7YXMVT7R3D6RX", "read_write": "A7BLTFEQ2M36EXHXEWRGW6KUVC73T2PSS" }
These keys may be given to other peers so that they can access the share too. An encryption key is also available. It allows a peer to download a shared directory but it will remain encrypted on their drive. This allows you to back up data to a cloud storage provider and still maintain secrecy. Fourth and fifth keys also exist that allow read/write or read-only privileges for only 24 hours, as given below:
CADP2MQADRJ7K3I4EHG533T7N7XC5XGFX
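The get_secrets request shown earlier can be assembled and its reply parsed with the standard library. This is a hedged sketch: the address 127.0.0.1 and port 8888 are placeholders, the field names read_only and read_write are assumed, the sample reply is copied from the example in this section, and authentication and the HTTP request itself are omitted.

```python
import json
from urllib.parse import urlencode


def get_secrets_url(address, port, secret=None, encryption=False):
    """Build the API URL for the get_secrets method."""
    params = {"method": "get_secrets"}
    if secret is not None:
        params["secret"] = secret      # an existing master (read/write) key
    if encryption:
        params["type"] = "encryption"  # also ask for an encryption key
    return f"http://{address}:{port}/api?{urlencode(params)}"


def parse_secrets(body):
    """Turn the JSON reply into a dictionary of key type -> key."""
    return json.loads(body)


# Placeholder address and port; the secret is the read/write key from the text.
url = get_secrets_url("127.0.0.1", 8888,
                      secret="A7BLTFEQ2M36EXHXEWRGW6KUVC73T2PSS", encryption=True)

# Sample reply (assumed field names, values from the example in this section).
SAMPLE_REPLY = ('{"read_only": "BKVWKOTNX3PTKDXKOKOU7YXMVT7R3D6RX", '
                '"read_write": "A7BLTFEQ2M36EXHXEWRGW6KUVC73T2PSS"}')
secrets = parse_secrets(SAMPLE_REPLY)
```

An application would hand `secrets["read_only"]` to followers and keep `secrets["read_write"]` private, since the latter allows the shared folder to be changed.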
A QR code is also generated which allows users to easily share the sync keys with users on mobile devices through the camera interface. A sample QR code can be seen in Figure 2.3↓.
figure Diagrams/QRCode.png
Figure 2.3 A QR code that can be used to share keys between peers

2.4.2 Cloud Synchronization Services Transaction

figure Diagrams/Cloud-Synchronization-Service.png
Figure 2.4 Typical cloud synchronization service [9]
A typical cloud synchronization service can be simplified into the following steps (see Figure 2.4) [9]:

2.4.3 BitTorrent Sync Transaction

figure Diagrams/BitTorrentSync-Transaction.png
Figure 2.5 Typical BitTorrent Sync Transactions [9]
According to [9], there are five different ways that BitTorrent Sync can discover peers and transmit traffic. These are very similar to the way that the standard BitTorrent protocol does this.

LAN Discovery

As noted in path A (Figure 2.5), this method of discovery searches the local network for peers. It does so by sending a broadcast packet to the whole local network on port 3838 [9]. The packet includes the sender’s IP address, the port and the ID of the file(s) that it is trying to share.

Peer Exchange

Once peers have established a connection, they take part in peer exchange (PEX). PEX involves the peers exchanging information about other peers in the swarm.

Tracker

According to [9], in this mode the BitTorrent Sync application will request a peer list from a tracker located at t.usyncapp.com. t.usyncapp.com is resolved in Listing 2.3↓.
kieran@g11h3779-1:~$ nslookup t.usyncapp.com
Server:   146.231.129.97
Address:  146.231.129.97#53

Non-authoritative answer:
Name:     t.usyncapp.com
Address:  54.225.196.38
Name:     t.usyncapp.com
Address:  54.225.92.50
Name:     t.usyncapp.com
Address:  54.225.100.8
Listing 2.3 Nslookup results for t.usyncapp.com.
As denoted in path B (Figure 2.5), the seeder sends a request to the tracker for a list of peers. The tracker adds the seeder to the active peer list. Now any leecher can request the peer list from the tracker and, provided that a matching ShareID is found, download the file.
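The tracker's role in this exchange amounts to keeping a list of active peers per ShareID. The in-memory class below is a toy model for illustration only: the class name, the method names and the peer addresses are hypothetical, and it says nothing about the wire format used by the real tracker at t.usyncapp.com.

```python
from collections import defaultdict


class ToyTracker:
    """Keeps the set of active peers for every ShareID."""

    def __init__(self):
        self._peers = defaultdict(set)

    def announce(self, share_id, peer):
        """Add the peer to the active list and return the other known peers."""
        others = sorted(self._peers[share_id] - {peer})
        self._peers[share_id].add(peer)
        return others

    def lookup(self, share_id):
        """Return the active peers for a ShareID, or nothing if it is unknown."""
        return sorted(self._peers.get(share_id, set()))


tracker = ToyTracker()
first = tracker.announce("share-1", ("10.0.0.1", 4000))   # the seeder announces itself
second = tracker.announce("share-1", ("10.0.0.2", 4000))  # a leecher asks for peers
unknown = tracker.lookup("share-2")                       # no matching ShareID
```

The leecher learns about the seeder only because both announced the same ShareID; a request for an unknown ShareID returns no peers.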

Distributed Hash Table

A distributed hash table (DHT) allows peers to tell other peers what they are sharing; note that these peers do not necessarily have to form part of the swarm. This spreads the hash table out, thereby decentralizing it.

Known Peers

As [9] describes, “[this is the] least detectable method of peer discovery.” In this method the peers explicitly set a list of known IP addresses that should be consulted for potential downloads.

2.4.4 Relay Servers

Relay servers are in place in case two clients are unable to communicate with each other. This may occur if a peer has been explicitly banned via an Access Control list. BitTorrent Sync will try to bypass the block by routing the traffic through a relay server. The server is located at r.usyncapp.com [9]. Listing 2.4↓ describes the results of the nslookup command.
kieran@g11h3779-1:~$ nslookup r.usyncapp.com
Server:   146.231.129.97
Address:  146.231.129.97#53

Non-authoritative answer:
Name:     r.usyncapp.com
Address:  67.215.229.106
Name:     r.usyncapp.com
Address:  67.215.231.242
Listing 2.4 Nslookup of r.usyncapp.com.

2.5 Previous Work

2.5.1 Vole.cc

A group based out of both the United States and South Africa has developed a bare-bones social network called Vole.cc [U]  [U] http://vole.cc/. It follows a Twitter-like [V]  [V] https://twitter.com/ approach where you subscribe to various people and receive any posts that they make.
The application uses BitTorrent Sync’s application programming interface (API) and the user interface is accessed via a web browser.
This is a very similar concept to the first application implemented in this research, although the two differ in a few ways. The intention here is to eliminate the web browser entirely and implement a more traditional window-based program. Vole.cc does not delve deeply into other social network aspects such as group discussions, like Facebook groups and Google Groups [W]  [W] https://groups.google.com/forum/#!overview. It also does not go deeply into real-time social networking applications such as chat systems. Real-time applications are not common in BitTorrent.
A Wired article [X]  [X] http://www.wired.com/2014/02/bittorrent-sync/ reports that BitTorrent Sync and its spin-offs were created as a way to avoid snooping (especially by governments). Although they may operate similarly to other, server-based, social networks, they do not store data centrally and can offer end-to-end encryption (see Section 2.9↓).

2.5.2 SyncNet

Another application of BitTorrent Sync is SyncNet [Y]  [Y] http://jack.minardi.org/software/syncnet-a-decentralized-web-browser/. Developed by Jack Minardi, it is a web browser that uses BitTorrent Sync to decentralise the web. When a user downloads a website from the original server they store a copy of it on their device. As more and more people access the website its files become more and more spread out and decentralised. Because of this, if someone were to shut down the original site it would be almost impossible to stop users from sharing it.
As of writing this, SyncNet only works on sites with static content. That is to say that once you publish the content you cannot change it.

2.6 Similar Technologies

As this thesis deals with the creation of a distributed social network using BitTorrent Sync as a method of transferring files between clients, this section looks at other technologies that try to achieve similar results.
Peer-to-peer systems have seen a major increase in use as the bandwidth available to the average user increases. P2P sees users sharing both storage and CPU cycles (SETI@Home [Z]  [Z] http://setiathome.ssl.berkeley.edu/) [21].

2.6.1 Tribler

In a paper by [20] the authors argue that when peers in a swarm are socially connected (have some connection outside of the transaction) they are less likely to free ride [15, 32]. [20] propose three key areas in a P2P system which challenge the entire technology.
The first challenge is decentralization. Although progress has been made in decentralizing systems like BitTorrent [16], much of the world’s BitTorrent traffic relies on central systems like trackers.
Another issue relates to availability. Because P2P systems function like grassroots movements, their availability is subject to the availability of the peers. In another paper, [19] find that only 4% of P2P users have their clients on for more than 10 hours at a time. This suggests that most users download a file and then close their client once the download has completed. [20] believe that “social incentives” could help improve peer availability.
The third issue proposed relates to integrity. [20] say that in a socially structured P2P system users would be more inclined to help clean data and report misuse.
The crux of the Tribler system is the de-anonymization of peers. In a traditional BitTorrent transaction peers are identified by their IP addresses alone. Tribler seeks to create social groups around similar interests. Users select nicknames so that they can be easily identified by other users. Tribler also helps to connect peers by use of “Taste Buddies”, which lists users with similar tastes in files. “Taste Buddies” are determined by which files a user downloads or by the “Files I Like” category.
Users are uniquely identified by PermIDs [20]. PermIDs make use of public-private key encryption to identify users.

2.6.1.1 Shared Downloading

The Tribler designers have created a collaborative download protocol that relies heavily on the social aspect of the technology. 2Fast [10] allows users to ask their friends to help them download a file. 2Fast separates peers into collectors and helpers. Collectors are the peers who wish to download the complete file, and helpers are usually the friends of the collector. The protocol works by having each helper ask the collector whether the chunk that it is about to download is unique (i.e. no other helper is downloading that chunk); once the helper has completed the download it sends the chunk to the collector, asking for nothing in return. [20] tested the 2Fast protocol on varying line speeds and for varying numbers of helpers:
figure Diagrams/tribler.png
Figure: Resulting speed increase from having helpers in a download [20].
With an 8 Mbps download speed it can be seen that 2Fast results in just under a sixfold speed increase over a non-shared download.
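The collector and helper roles can be sketched as follows. This is an illustrative model of the coordination step only, under the assumption that the collector keeps the bookkeeping; the class and method names are hypothetical and this is not the 2Fast wire protocol from [10].

```python
class Collector:
    """Toy 2Fast-style collector that coordinates its helpers."""

    def __init__(self, num_chunks):
        self.missing = set(range(num_chunks))  # chunks not yet received
        self.claimed = {}                      # chunk index -> helper name
        self.received = {}                     # chunk index -> chunk data

    def request_chunk(self, helper, chunk):
        """A helper asks whether it may fetch `chunk` for the collector."""
        if chunk not in self.missing or chunk in self.claimed:
            return False  # already received, or another helper is on it
        self.claimed[chunk] = helper
        return True

    def deliver(self, helper, chunk, data):
        """A helper hands over a finished chunk, asking nothing in return."""
        if self.claimed.get(chunk) != helper:
            return False  # this helper never claimed the chunk
        self.received[chunk] = data
        self.missing.discard(chunk)
        del self.claimed[chunk]
        return True


collector = Collector(num_chunks=4)
first = collector.request_chunk("alice", 0)      # alice may fetch chunk 0
duplicate = collector.request_chunk("bob", 0)    # bob is turned away
delivered = collector.deliver("alice", 0, b"chunk-0")
```

Rejecting the duplicate request is what keeps helpers from wasting bandwidth on the same chunk, which is where the shared download gains its speed.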

2.6.2 Maze

Maze was designed to supplement CERNET [A]  [A] www.cernet.net/, China’s education and research network. According to [26], users said that they were unhappy with the speed and availability of the FTP servers that were hosting the content.
Maze functions in a similar style to the old Napster [B]  [B] www.napster.com/, LimeWire [C]  [C] http://www.limewire.com/ (discontinued) and the Direct Connect [18] protocol. Maze assigns metadata to each file which allows for deeper searching of files. Indexes of the files are stored on central index servers which allow for the searching of files, even those currently offline. Clients report when they come online. Maze tries to reduce latency by connecting you to peers who are closer based on the last 24 or 16 bits of their address [26].
Maze has introduced a novel way of incentivising sharing. It awards points to users for uploading and deducts points for downloading. A user with more points also has a higher priority when queued for a download. Users below a certain threshold have their download speed throttled. [26] notes that users with higher points have a prestige attached to them. They also note that people become competitive with each other to obtain more points, trying to beat each other to sharing files.
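A rough sketch of such a points scheme is given below. The reward, cost and threshold values are invented for illustration and are not the values used by Maze or reported in [26]; the class and function names are likewise hypothetical.

```python
class MazeAccount:
    """Toy points account; all numeric values are illustrative assumptions."""

    UPLOAD_REWARD = 1.5    # points earned per MB uploaded (assumed)
    DOWNLOAD_COST = 1.0    # points spent per MB downloaded (assumed)
    THROTTLE_BELOW = 0.0   # throttling threshold (assumed)

    def __init__(self, name, points=0.0):
        self.name = name
        self.points = points

    def record_upload(self, megabytes):
        self.points += self.UPLOAD_REWARD * megabytes

    def record_download(self, megabytes):
        self.points -= self.DOWNLOAD_COST * megabytes

    @property
    def throttled(self):
        """True when the account has fallen below the threshold."""
        return self.points < self.THROTTLE_BELOW


def download_queue(accounts):
    """Order waiting users so that higher point totals are served first."""
    return sorted(accounts, key=lambda account: account.points, reverse=True)


sharer = MazeAccount("sharer")
leech = MazeAccount("leech")
sharer.record_upload(100)
leech.record_download(20)
queue = [account.name for account in download_queue([leech, sharer])]
```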
Users of Maze have been caught misusing the system in an effort to artificially boost their points. Reports say that users often switch identities when their points fall below the throttling threshold. Some users have also been caught using search engine optimization techniques to make their files show up in more search results.

2.6.3 Gnutella

According to [21], Gnutella clients are known as servents as they act as both SERVers and cliENTS. Each servent is able to issue and receive queries for data as well as view search results. To join the system, nodes need to first connect to a known host (some may be found here [D]  [D] http://gwebcaches.pongwar.com/gnutella.html). Messages may be sent by either a broadcast or by back-propagation. Each message has a unique identifier and each node keeps track of the last few messages that it passed on thus limiting the amount of resending of messages. Messages are also given time-to-live fields to prevent them from eternally being passed on.
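The combination of duplicate suppression and time-to-live fields can be sketched with a small simulation. This is a simplified model, not the Gnutella protocol itself: the network is an assumed adjacency dictionary, and each node remembers every message ID it has seen rather than only the last few.

```python
from collections import deque


def flood(graph, origin, msg_id, ttl, seen=None):
    """Flood msg_id from origin; return the set of nodes that accepted it."""
    if seen is None:
        seen = {node: set() for node in graph}
    # Each queue entry is (receiving node, sending node, remaining hops).
    queue = deque([(origin, None, ttl)])
    reached = set()
    while queue:
        node, sender, hops_left = queue.popleft()
        if msg_id in seen[node]:
            continue  # duplicate: this node has already handled the message
        seen[node].add(msg_id)
        reached.add(node)
        if hops_left == 0:
            continue  # TTL exhausted: accept the message but do not forward
        for neighbour in graph[node]:
            if neighbour != sender:
                queue.append((neighbour, node, hops_left - 1))
    return reached


# A triangle of servents (a, b, c) with a fourth (d) attached to c.
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
one_hop = flood(graph, "a", "query-1", ttl=1)
two_hops = flood(graph, "a", "query-2", ttl=2)
```

With a time-to-live of 1 the query never reaches d, while the cycle between a, b and c does not cause the message to circulate indefinitely because each node drops identifiers it has already seen.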

2.6.4 Distributed Internet Mail

In a paper by [17], the author describes a system for implementing a distributed Internet mail. The paper discusses aspects from Internet security to how the mail system will function.

2.7 Community driven alternatives

The Free Software Foundation has listed software projects which they deem to be of high priority and importance [14]. A free software replacement for BitTorrent Sync is one of the many projects that they have listed. The drive towards an alternative has been spearheaded by a Libre Planet (a space for organising free software development) group [13]. The Libre Planet page contains a curated list of the best alternatives to BitTorrent Sync. The page separates the solutions into three separate categories:
These applications may come in useful for future work on distributed applications that rely on more functionality than what is offered by the BitTorrent Sync API.

2.8 Scalability

Peer-to-peer (P2P) architectures are inherently more scalable than their client-server alternatives. They do, however, suffer from other downfalls. [11] give a very good example of how this is true.
In their example they measure the difference in distribution time between a single server uploading to many clients successively and a single original uploader sharing with many leechers. They start by assuming that the Internet core has infinite bandwidth and that any bottlenecks occur at its extremities. In the client-server model, the server needs to send the file to every client individually. A bottleneck may occur when a client has a particularly slow download speed, so that all clients after it are delayed in starting their downloads. Given $N$ clients and a file of length $F$ bits, the server needs to send $NF$ bits. If the server has an upload rate of $u_s$, the total time to upload the file to all clients is at least $NF / u_s$. If each client has a maximum download speed, the client with the lowest maximum download speed ($d_{min}$) cannot receive the file in less than $F / d_{min}$. The distribution time is thus:
$$D_{cs} \geq \max\left\{ \frac{NF}{u_s}, \frac{F}{d_{min}} \right\}$$
This is the lower bound of the distribution time, and for a large enough N, our distribution time is linear in N.
For a P2P approach, each peer can share a file once it has finished downloading it. Again according to [11], the P2P analysis starts off in the same way as the client-server one: the original seeder must upload each bit of the file at least once, which takes at least $F / u_s$, and the slowest downloader again needs at least $F / d_{min}$. The total upload rate of the system is given by the sum of the upload rates of the seeder and of each peer, $u_{total} = u_s + u_1 + \dots + u_N$, and the system as a whole must deliver $NF$ bits. Thus the distribution time is:
$$D_{P2P} \geq \max\left\{ \frac{F}{u_s}, \frac{F}{d_{min}}, \frac{NF}{u_s + \sum_{i=1}^{N} u_i} \right\}$$
Once again it is safe to assume that a system can reach this lower bound.
figure Diagrams/scalability.png
Figure: The difference between client-server and peer-to-peer architectures in terms of transfer time [11].
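The two lower bounds can be compared numerically with a short sketch. The function names and the example figures (file size, rates and number of peers) are assumptions chosen purely for illustration and do not come from [11].

```python
def client_server_time(file_bits, server_up, client_downs):
    """Lower bound on client-server distribution time, in seconds."""
    n = len(client_downs)
    return max(n * file_bits / server_up, file_bits / min(client_downs))


def p2p_time(file_bits, server_up, peer_ups, peer_downs):
    """Lower bound on P2P distribution time, in seconds."""
    n = len(peer_downs)
    total_up = server_up + sum(peer_ups)
    return max(
        file_bits / server_up,        # the seeder must upload each bit once
        file_bits / min(peer_downs),  # the slowest peer must receive the file
        n * file_bits / total_up,     # NF bits over the total upload capacity
    )


F = 8 * 10**9          # an 8 Gbit file (assumed)
U_S = 100 * 10**6      # seeder/server upload rate of 100 Mbps (assumed)
ups = [10 * 10**6] * 10    # ten peers uploading at 10 Mbps each (assumed)
downs = [50 * 10**6] * 10  # ten peers downloading at 50 Mbps each (assumed)

d_cs = client_server_time(F, U_S, downs)
d_p2p = p2p_time(F, U_S, ups, downs)
```

Under these assumed figures the client-server bound is 800 seconds and the P2P bound is 400 seconds. The first grows linearly as peers are added, whereas in the second every new peer also adds upload capacity to the denominator.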

2.9 Security

According to its technical documentation [3], BitTorrent Sync uses the Secure Remote Password protocol (SRP). Developed by [25], SRP is designed to give even easily guessed passwords a high level of security, making it resistant to man-in-the-middle attacks and eavesdropping.
BitTorrent Sync provides end-to-end encryption as well as encrypted saving. This allows the user to save files in an encrypted format on a server that they may not trust.
BitTorrent Sync is, however, a closed source application and so its security only extends as far as the user trusts the developers.

2.10 Alternative Protocols

There are many alternative technologies that can be used instead of BitTorrent Sync. One particular competitor to BitTorrent Sync is the popular Linux tool, developed by Samba, rsync [U]  [U] http://rsync.samba.org/ [24]. According to the rsync manual page:
rsync(1)    rsync(1)
NAME        rsync - a fast, versatile, remote
            (and local) file-copying tool
rsync is very simple to use. To use it one simply specifies the file that the user wants to move, the remote host and then the remote location. An example of that would be:
rsync /home/kieran/myfile.txt
      kieran@otherpc.ru.ac.za:/home/kieran/myfile.txt
BitTorrent Sync has a few advantages over rsync. BitTorrent Sync, via the BitTorrent protocol, has proven that it can transfer files effectively between hosts that do not necessarily know each other before the transfer starts. It has the means in place to associate peers with each other, which is not natively built into rsync. All of that functionality would have to be put in place in order to make rsync comparable.
BitTorrent Sync is also inherently scalable. If a peer wanted to send a file to many other peers it would not cost that peer much bandwidth, as the other peers would help with the sharing; rsync does not have this kind of capability and so could cost the original uploader large amounts of bandwidth to share that file with many people.
rsync does not natively work well with computers that have dynamically assigned IP addresses (some kind of dynamic DNS needs to be put in place to provide a static location to send the files) nor with computers that are on a NAT subnet.
rsync does not provide its own security either; it relies on the security of the protocol through which the data is being sent. If messages are sent via SSH then they should be secure, whereas messages sent via an unencrypted protocol would not be.
Finally BitTorrent Sync has implementations on most major platforms which means that developing on those platforms will be much easier.

2.11 Summary

In summary I feel that the key issues relating to a distributed file system using BitTorrent Sync are security, data availability, data integrity and peer management.
From this literature review it is apparent that the amount of research directly relating to BitTorrent Sync is minimal. There is, however, an abundance of information on BitTorrent and various protocols stemming from and similar to BitTorrent. Sections such as Incentives in Peer-to-peer Networks (Section 2.3.4↑), which are of academic interest, should not be that important in a social networking context for reasons which are covered in that section. I believe that this is a solid base from which to create various social networks that operate on BitTorrent Sync.

3 Research Methodology

3.1 Choice of language

The Python [V]  [V] https://www.python.org/ programming language was selected for the development of the two applications. It is not the language in which the developer is most experienced; however, it offers an ease of development that is seemingly unrivaled by other languages. Python is dynamically and strongly typed, object-oriented and portable. Its main advantage, however, is its popularity and, therefore, the number of external libraries (Section 3.2↓) available to the developer. There is no BitTorrent Sync API specifically for Python. The API is communicated with through HTTP GET requests.

3.2 External Libraries used

The following external libraries were used:

3.2.1 Tkinter

Tkinter [W]  [W] http://tkinter.unpythonic.net/wiki/ is one of the most widely used Python graphical user interface (GUI) packages. Tkinter forms a layer between Python and the Tcl/Tk software, which is supported on most of the major platforms, making this project fairly portable [23].
Tkinter is a more complex wrapper than, say, python-btsync (Subsection 3.2.3↓). A Python program communicating with Tkinter makes the following calls down the stack [X]  [X] https://docs.python.org/3/library/tkinter.html#how-tk-and-tkinter-are-related:

3.2.2 SQLite3

SQLite is a lightweight, fast and reliable disk-based database system that uses simple file storage. It also employs a slight variant of the SQL query language making the development process unique but familiar.
A SQLite database was used to store the secrets/keys associated with the various folders that are to be synchronized between the peers in the network. The table used for this purpose has another column for an alias to be associated with a share, to make it easier for the user to identify a specific share. More information on the database can be found in Listing 4.1↓. The Python programming language comes, by default, with libraries that allow it to communicate with a SQLite database. Calls to this library were used extensively throughout the development of the project.
Once again the SQLite library is actually a C library with a Python wrapper. Making use of the SQLite library is a straightforward process given in Listing 3.1↓ [Y]  [Y] https://docs.python.org/2/library/sqlite3.html.
import sqlite3
conn = sqlite3.connect('example.db')

c = conn.cursor()

# Create table
c.execute('''CREATE TABLE stocks (date text, trans text, symbol text, qty real, price real)''')

# Insert a row of data
c.execute("INSERT INTO stocks VALUES ('2006-01-05','BUY','RHAT',100,35.14)")

# Save (commit) the changes
conn.commit()

# We can also close the connection if we are done with it.
# Just be sure any changes have been committed or they will be lost.
conn.close()
Listing 3.1 Using the sqlite3 Python library.

3.2.3 python-btsync

A Python wrapper for BitTorrent Sync was developed by Jack Minardi [Z]  [Z] https://GitHub.com/jminardi/python-btsync (see subsection 2.5.2↑), and GitHub users probar [A]  [A] https://GitHub.com/probar and Ademan [B]  [B] https://GitHub.com/ademan. It was further extended for the purposes of this project in order to make use of more of the API's features.
The original implementation of the python-btsync wrapper implemented the following methods for use within a Python application; methods added for the purpose of this project are in uppercase:
An example of one of the methods is as follows:
def get_folder_peers(self, secret=None):
    """
    Returns list of peers connected to the specified folder.
    [
        {
            "id": "ARRdk5XANMb7RmQqEDfEZE-k5aI=",
            "connection": "direct", // direct or relay
            "name": "GT-I9500",
            "synced": 0, // timestamp when last sync completed
            "download": 0,
            "upload": 22455367417
        }
    ]
    http://[address]:[port]/api?method=get_folder_peers&secret=(secret)
    secret (required) - must specify folder secret
    """
    params = {'method': 'get_folder_peers'}
    if secret is not None:
        params['secret'] = secret
    return self._request(params)
Listing 3.2 A sample method from python-btsync.
A request is then executed as follows [C]  [C] Jack Minardi: https://GitHub.com/jminardi/python-btsync/blob/master/btsync.py#L196:
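def _request(self, params):
    # Method excerpt: relies on the json and urllib modules (Python 2;
    # urlencode lives in urllib.parse in Python 3).
    params = urllib.urlencode(params)
    # Issue the GET request against the local BitTorrent Sync API endpoint.
    self.conn.request('GET', '/api?' + params, '', self.headers)
    resp = self.conn.getresponse()
    if resp.status == 200:
        return json.load(resp)
    else:
        raise RuntimeError('{}: {}'.format(resp.status, resp.reason))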
Listing 3.3 Jack Minardi’s python-btsync _request method.
python-btsync offers all of the functionality needed to interact with the BitTorrent Sync API. Furthermore it adds a good amount of abstraction to the problem which has allowed for an easy method of reimplementing a different BitTorrent Sync application such as with subsections 4.2↓ and 4.3↓.

3.3 Design choices

A conscious decision was made early on in the development of this project to avoid using a web interface to present data to, and provide interaction with, the users. This decision was made as a way of distinguishing the distinctly peer-to-peer programs from their client-server counterparts. With the advent of the World Wide Web in the early 1990s many technologies were built on top of it. The web was designed with the traditional client-server model in mind, and as it has progressed into what we have today, many other technologies have emerged which all make use of it.
It is with this idea in hand that the designers decided to distance themselves from the web as people know it. The designers hope that this helps, at least at a fundamental level, to provide some distinguishing features for the user. Vole.cc (Subsection 2.5.1↑) is an example of a P2P application that is accessed via a web interface.
The applications that have been developed live entirely within the space of a distributed system (Subsection 2.2.1↑). This contrasts quite well with a web application. Web applications (such as a web page) appear to be simple networked applications accessed via a URL (which is then resolved to an IP address) or directly via an IP address, but this is true only on the surface. Modern, large, scalable web applications rely on massive distributed applications for delivery of content.

4 Results

Two applications were developed for the purpose of this project: one resembling a Twitter client and the other a Facebook group. It was an initial design goal not to use any kind of web-based user interface, in an attempt to avoid associating the applications too heavily with the traditional client-server models employed by most web applications. The requirement that users should not need a Web browser was not adopted by the developers of vole.cc (Subsection 2.5.1↑).

4.1 BitTorrent Sync and BitTorrent Sync API

BitTorrent, the creators of BitTorrent Sync, have made an application programming interface (API) available to selected developers who wish to create applications based on BitTorrent Sync. The API makes available the following methods to programmatically interact with BitTorrent Sync:
This API is interacted with via HTTP GET requests and the results are returned in JSON format. A typical call to the API would look something like the following [D]  [D] http://www.BitTorrent.com/sync/api#getFolder:
http://[address]:[port]/api?method=get_folders[&secret=(secret)]
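As a minimal sketch, such a request URL could be assembled in Python as shown below. The helper name `build_api_url`, the address, the port 8888 and the example secret are all assumptions made for illustration; only the `/api?method=...&secret=...` shape comes from the call above. The sketch uses Python 3's `urllib.parse`.

```python
from urllib.parse import urlencode


def build_api_url(address, port, method, **params):
    """Build a BitTorrent Sync API URL for an HTTP GET request."""
    query = {"method": method}
    # Optional parameters (such as secret) are only added when supplied.
    query.update({key: value for key, value in params.items()
                  if value is not None})
    return "http://{}:{}/api?{}".format(address, port, urlencode(query))


all_folders = build_api_url("127.0.0.1", 8888, "get_folders")
one_folder = build_api_url("127.0.0.1", 8888, "get_folders",
                           secret="EXAMPLESECRET")
```

Using `urlencode` rather than string concatenation ensures that any characters in a secret that are not URL-safe are escaped correctly.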

4.2 BitTwit

In this application we attempted to replicate some of the functionality available in Twitter using BitTorrent Sync and the BitTorrent protocol. Most of this functionality is discussed in subsection 1.1.1↑. Here we will discuss some of the functionality with an emphasis on actual implementation. The application was written primarily in the Python programming language and made use of the Tk widget toolkit for the user interface. A simple database (SQLite, see Listing 4.1↓) is used to keep track of which users are currently being followed (see Figure 4.2↓); it also associates an alias with each of their secrets.
Users are uniquely identified by their share keys. It was decided to allow users to give other users aliases when they subscribe to their feed so that they are easier to identify. This is analogous to a user on Facebook choosing a name as well as giving an email address. The email address is the way, along with perhaps a user ID and username, that Facebook uniquely identifies a user.
$ sqlite3 -line BitTwit.db '.tables'
following
$ sqlite3 -line BitTwit.db 'PRAGMA table_info(following);'
cid = 0
name = ID
type = int
notnull = 0
dflt_value =
pk = 0

cid = 1
name = alias
type = text
notnull = 0
dflt_value =
pk = 0
Listing 4.1 The structure of the SQLite database.
figure Screenshots/bitTwitmain.png
Figure 4.1 The BitTwit main window
The BitTwit main window (Figure 4.1↑) is what users see when they start the program. It contains a few simple buttons to allow user interaction and some information relating to the program.
Along the top of the program the user is presented with three buttons. These buttons are labeled Follow, Refresh and Unfollow respectively. The bottom row offers the user three further buttons for interaction. They are labeled followers, post and following. The traditional Twitter-like feed resides in the centre of the window. In Figure 4.1↑ there are currently two posts, “hello there” and “me.” At the bottom of the window is some information relating to the program. On the left-hand side is the current SQLite database version number and on the right-hand side is the current speed (for downloads and uploads).

4.2.1 Follow a User

figure Screenshots/bitTwitfollow.png
Figure 4.2 The BitTwit follow window
Clicking on the follow button reveals the follow window (Figure 4.2↑). This window is where the user goes to follow someone new. Here the user is presented with two text boxes to fill in. The top one, labeled secret, is where the user enters the unique secret or share key associated with the folder (see subsection 2.4.1↑). Since share secrets are the unique identifiers but are cumbersome for most humans to remember and associate with another user, the software allows the user to specify an alias. The alias is a non-unique identifier that lets a person associate an ordinary string of text with another user instead of the secret. Both fields are stored within the SQLite database (Listing 4.1↑) so that the alias may be used in place of the secret in everyday use.
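A minimal sketch of how a secret and alias pair might be stored and looked up is given below. The table layout follows Listing 4.1; the function names `follow` and `display_name`, the in-memory database and the example secret are assumptions for illustration and are not taken from the BitTwit source.

```python
import sqlite3

# An in-memory database stands in for BitTwit.db in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE following (ID int, alias text)")


def follow(conn, secret, alias=None):
    """Record a followed share together with its optional alias."""
    # Parameter substitution avoids building SQL by string concatenation.
    conn.execute("INSERT INTO following VALUES (?, ?)", (secret, alias))
    conn.commit()


def display_name(conn, secret):
    """Return the alias for a secret, falling back to the secret itself."""
    row = conn.execute(
        "SELECT alias FROM following WHERE ID = ?", (secret,)
    ).fetchone()
    return row[0] if row and row[0] else secret


follow(conn, "AEXAMPLESECRET", "alice")
follow(conn, "BEXAMPLESECRET")
```

Falling back to the secret when no alias has been set means the interface always has something to display, even for shares that were followed without an alias.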

4.2.2 Unfollow a user

figure Screenshots/bitTwitUnfollow.png
Figure 4.3 The BitTwit unfollow window
The BitTwit Unfollow window presents a list of the users that the current user has followed. Figure 4.3↑ shows a single user that has been subscribed to. On the left is the secret associated with the share. In the centre is the alias associated with the secret to make it easier to identify for the user. Finally on the right is the unfollow button. An unfollow button is created for each user that has been subscribed to previously in the application.
The refresh button refreshes the feed of Tweets so that any new postings since the last time the program was used are shown at the top. An automatic refresh is done when the program first loads to make sure that the user is presented with the most up-to-date posts from the start.

4.2.3 See Followers

figure Screenshots/seeFollowers.png
Figure 4.4 The BitTwit followers window
Figure 4.4↑ shows the BitTwit followers window. This window attempts to show which users are currently following you. It relies on an integral part of the BitTorrent Sync API. It uses the Get Folder Peers API call which returns what is shown in Listing 4.2↓.
[ 
	{ 
		"connection": "direct", 
		"download": 0, 
		"id": "EMgVMmtE_x10I99eTHTrVsVnDKE=", 
		"name": "kieran-netbook", 
		"synced": 1413286104, 
		"upload": 0 
	} 
]
Listing 4.2 A Get Folder Peers API call return string.
Listing 4.2↑ shows the name field which is used within the BitTwit followers window. One of the problems with the API is determining which peers relate to which users. The API offers two identifiers when it lists peers associated with a share. The first identifier is the name. This is similar to the concept of a hostname in computer networking. This name does not have to be unique and is often the computer’s actual hostname. The other identifier is the id. This identifier appears to be some kind of hash digest although, at this time, due to the lack of documentation, we are unsure of the scope within which it is unique.
If the id field were used by the tracking server to uniquely identify the peers within its own infrastructure then we could rely on this string to be unique and use it ourselves to uniquely identify followers. Because of the lack of documentation [E]  [E] http://www.BitTorrent.com/sync/api#getPeers we have to ignore this for the time being.
It would not be prudent to use the name given out by the API. Those names are subject to change and could confuse other users.
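The ambiguity between the two identifiers can be seen in a short sketch that parses a response shaped like Listing 4.2. The second peer entry, its id and the function names are invented for illustration; nothing here establishes whether the id field is in fact unique.

```python
import json

# First entry copied from Listing 4.2; the second is a made-up peer that
# happens to share the same name, to show why names alone are unreliable.
SAMPLE_RESPONSE = """
[
    {"connection": "direct", "download": 0,
     "id": "EMgVMmtE_x10I99eTHTrVsVnDKE=", "name": "kieran-netbook",
     "synced": 1413286104, "upload": 0},
    {"connection": "relay", "download": 0,
     "id": "HYPOTHETICAL-ID-0002=", "name": "kieran-netbook",
     "synced": 0, "upload": 0}
]
"""


def follower_names(response_text):
    """Return the name of every peer listed in a Get Folder Peers reply."""
    return [peer["name"] for peer in json.loads(response_text)]


def followers_by_id(response_text):
    """Index peers by their id field instead of their name."""
    return {peer["id"]: peer["name"] for peer in json.loads(response_text)}


names = follower_names(SAMPLE_RESPONSE)
by_id = followers_by_id(SAMPLE_RESPONSE)
```

The two peers are indistinguishable by name but remain separate when keyed by id, which is why a documented guarantee about the id field would be valuable.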
figure Screenshots/newpost.png
Figure 4.5 The new post window.
The new post window (Figure 4.5↑) allows users to create a post. It is very simple to use. Users need only type what they want to post and it will be added to their share, to be synchronized with all of the people who are following them.
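One way in which a post could be added to a share is sketched below. The on-disk format is an assumption: this sketch stores each post as a text file named after its timestamp, which is not necessarily how BitTwit stores posts. The function names are hypothetical, and a temporary directory stands in for the synchronized folder.

```python
import os
import tempfile
import time


def write_post(share_dir, text, now=None):
    """Save a post as <timestamp>.txt inside the synchronized folder."""
    timestamp = int(time.time() if now is None else now)
    path = os.path.join(share_dir, "{}.txt".format(timestamp))
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path


def read_feed(share_dir):
    """Return the text of every post in the folder, newest first."""
    names = [name for name in os.listdir(share_dir) if name.endswith(".txt")]
    names.sort(key=lambda name: int(os.path.splitext(name)[0]), reverse=True)
    posts = []
    for name in names:
        with open(os.path.join(share_dir, name), encoding="utf-8") as handle:
            posts.append(handle.read())
    return posts


with tempfile.TemporaryDirectory() as share_dir:
    write_post(share_dir, "hello there", now=1)
    write_post(share_dir, "me", now=2)
    feed = read_feed(share_dir)
```

Because the application only writes files into the folder, distributing the post to followers is left entirely to BitTorrent Sync.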

4.3 BitFace

figure Screenshots/bitFace-main.png
Figure 4.6 The main window of BitFace.
Figure 4.6↑ shows the main window of BitFace, which is what users are presented with when they first open the program. The program looks and functions very similarly to BitTwit (Section 4.2↑). Along the top of the program the user is presented with three buttons: follow, refresh and unfollow. Underneath the three buttons is a drop-down list. This list allows the user to select which group they are currently viewing. In the centre of the program is space for all of the users' posts to be displayed. Towards the bottom are three further buttons: new group, post and following. Furthermore there is an information panel at the very bottom which gives information on the database version as well as the current transfer speed according to BitTorrent Sync.

4.3.1 Follow A New Group

figure Screenshots/bitFace-newfollow.png
Figure 4.7 The follow window of BitFace.
The follow window (Figure 4.7↑) is almost identical to the BitTwit follow window (Figure 4.2↑). It gives the user a place to add the secret/key associated with the group to which they want to subscribe. The second option allows the user to assign an optional alias to the group so that it is easier to identify. Clicking the follow button then adds the group to all of the other groups that the user follows so they will receive updates; that process is shown in Figure 4.11↓.

4.3.2 Remove A Group

figure Screenshots/bitFace-remove.png
Figure 4.8 The remove group window.
Removing a group that the user follows is done via the remove follow window (Figure 4.8↑). It lists all of the groups in the format secret alias. It also provides a button for each group so that the user may remove groups which they no longer wish to be part of. Removing a group that the user has previously subscribed to will remove that group and alias from the database. It will also delete the folder which previously contained all of the files which were to be synchronized with the various other users.
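The two clean-up steps can be sketched as follows. The table name `following` is borrowed from BitTwit's Listing 4.1 and is only assumed to apply to BitFace; the function name and example values are hypothetical. Any call needed to tell BitTorrent Sync itself to stop synchronizing the folder is omitted.

```python
import os
import shutil
import sqlite3
import tempfile


def remove_group(conn, secret, folder):
    """Forget a followed group and delete its local synchronized folder."""
    conn.execute("DELETE FROM following WHERE ID = ?", (secret,))
    conn.commit()
    # ignore_errors avoids failing if the folder has already been removed.
    shutil.rmtree(folder, ignore_errors=True)


# Set up an in-memory database and a temporary folder for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE following (ID int, alias text)")
conn.execute("INSERT INTO following VALUES (?, ?)",
             ("EXAMPLESECRET", "my group"))
folder = tempfile.mkdtemp()

remove_group(conn, "EXAMPLESECRET", folder)
remaining = conn.execute("SELECT COUNT(*) FROM following").fetchone()[0]
```

Deleting the database row before the folder means that a failure part-way through leaves stray files on disk rather than a group listed in the interface with no folder behind it.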

4.3.3 Post Into a Group

figure Screenshots/bitFace-newpost2.png
Figure 4.9 BitFace new post window.
Posting to a group with BitTorrent Sync is, again, very similar to how it’s done in BitTwit (Figure 4.5↑). As seen in Figure 4.9↑ the post window still offers a place for the user to write their post as well as the familiar post button. A new drop-down menu has been added that allows the user to select which group they wish to post into. The current state of the system is such that anyone may post into a group so long as they have a read/write key. This limitation has been discussed in Subsection 1.1.2↑.
Adding a new group (one that does not previously exist) is done by clicking on the New Group button. The new group button opens the create a new group window (as seen in Figure 4.10↓). This window allows the user to specify an alias for the group, which is optional, and then create that group.

4.3.4 Create A New Group

figure Screenshots/bitFace-newgroup.png
Figure 4.10 The BitFace create new group window.
Creating a new group is very simple. The user simply specifies an alias if they so wish and the group will be created and added along with the other groups that they are subscribed to. The process of creating and adding a group is outlined in Figure 4.12↓.
figure Diagrams/thesis_bittwit_followgroup.png
Figure 4.11 The algorithm for following an existing group in BitFace.
figure Diagrams/thesis_bitface_addgroup.png
Figure 4.12 The algorithm for creating a new group in BitFace.

5 Conclusion

5.1 Challenges

It has to be said that the largest challenge in developing this project was the poor documentation for BitTorrent Sync. Poor documentation can be remedied by having the source code openly available for consultation. As neither the source code nor any truly meaningful documentation was available, it is felt that this hampered the progress of this project and will continue to hamper future endeavours on the subject. There are many aspects of the API which appear to be available but are undocumented, so they cannot be reliably used or trusted.
Another challenge to the progress of this project was that the underlying architecture was in its infancy and, more directly, that there were very few publications on the matter. This may be apparent from the low number of directly related texts which have been referenced. The only other academic research that seems to exist at this time has been conducted by a network forensics group from University College Dublin.

5.2 Future Work

There is a huge amount of potential for development in this field. It has the potential to offer incredible scalability with very little initial cost to developers. It can also offer enough decentralization that services running on BitTorrent Sync become very difficult to shut down, as there is no central point to destroy.
For this type of initiative to be a success, a solution that is more open and free must be developed. It is felt that, before future work builds on top of the BitTorrent Sync application, more manpower and time must be put towards creating a viable alternative.
For new technologies to become successful they often need what is called a killer feature: a feature that drives initial adoption and, eventually, the further development that makes the platform sustainable. BitTorrent Sync has no such feature. It might be argued that a killer feature has already been developed in the form of sharing files via BitTorrent. However, although BitTorrent is very popular, it remains infamous for allowing users to distribute illegal copies of files, and because of the distributed nature of the system these downloads are difficult to stop. BitTorrent Sync still needs a killer feature of its own.
A system for naming peers would prove to be very useful if this system became more mainstream. If the ID tag given in Listing 4.2↑ is genuinely unique throughout the entire tracker, then it would be possible to associate a more meaningful name with the ID. This could perhaps be something similar to the Domain Name System employed throughout the Internet today, which associates memorable names with IP addresses.
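As a rough illustration of the naming idea, the following sketch keeps a two-way table between memorable names and peer IDs. It is a hypothetical, in-memory example only: the class `PeerNameRegistry` and its methods are invented for illustration, the peer IDs are placeholder strings, and the sketch does not address how such a table would be distributed or trusted among peers.

```python
from typing import Dict, Optional


class PeerNameRegistry:
    """Toy lookup table mapping memorable names to peer IDs and back."""

    def __init__(self) -> None:
        self._id_by_name: Dict[str, str] = {}
        self._name_by_id: Dict[str, str] = {}

    def register(self, name: str, peer_id: str) -> bool:
        """Bind name to peer_id; refuse if either one is already bound."""
        key = name.lower()  # treat names as case-insensitive, as DNS does
        if key in self._id_by_name or peer_id in self._name_by_id:
            return False
        self._id_by_name[key] = peer_id
        self._name_by_id[peer_id] = key
        return True

    def resolve(self, name: str) -> Optional[str]:
        """Return the peer ID bound to name, or None if unknown."""
        return self._id_by_name.get(name.lower())

    def reverse(self, peer_id: str) -> Optional[str]:
        """Return the name bound to peer_id, or None if unknown."""
        return self._name_by_id.get(peer_id)
```

The one-to-one restriction relies on the same assumption made in the text, namely that each peer ID is genuinely unique.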

5.3 What Has Been Learnt

It is thought that this type of project is a very good introduction to software development. Due to the nature of the underlying architecture, it also helps to create a sense of the scale that modern web applications need to be able to handle.
Also, because these applications are some of the first built on top of such distributed systems, they show what is possible with the technology.

5.4 Summary

As can be seen from Chapter 4↑, it is possible to develop functioning applications based on BitTorrent technology, namely BitTorrent Sync. It is believed that having alternatives to client-server technologies, which allow users to decide who has access to their data and which scale very well without massive cost overheads, will lead the way for many more developers to build far-reaching applications.

References

[1] N. Andrade, M. Mowbray, A. Lima, G. Wagner, M. Ripeanu: “Influences on Cooperation in Bittorrent Communities”, Proceedings of the 2005 ACM SIGCOMM workshop on Economics of peer-to-peer systems, pp. 111-115, 2005. URL http://x86.cs.duke.edu/courses/cps082/fall09/p2p-papers/p111-andrade.pdf.

[2] BitTorrent Inc.: BitTorrent Sync User Manual. 2013. URL http://www.bittorrent.com/help/manual/. Accessed[22/05/2014].

[3] BitTorrent Inc.: Technology. 2013. URL http://www.bittorrent.com/sync/technology. Accessed[23/05/2014].

[4] Bram Cohen: BitTorrent - a new P2P app. 2001. URL https://groups.yahoo.com/neo/groups/decentralization/conversations/topics/3160. Accessed[29/05/2014].

[5] Bram Cohen: The BitTorrent Protocol Specification. BITTORRENT, 2008. URL http://www.bittorrent.org/beps/bep_0003.html.

[6] Bram Cohen: “Incentives build robustness in BitTorrent”, Workshop on Economics of Peer-to-Peer systems, pp. 68-72, 2003. URL http://pdos.csail.mit.edu/6.824-2010/papers/cohen-btecon.pdf.

[7] Dropbox: Dropbox Fact Sheet. 2014. URL https://www.dropbox.com/static/docs/DropboxFactSheet.pdf.

[8] Eytan Adar, Bernardo A. Huberman: Free Riding on Gnutella, 2000. URL http://www.hpl.hp.com/research/idl/papers/gnutella/gnutella.pdf.

[9] J. Farina, M. Scanlon, M. Kechadi: “BitTorrent Sync: First Impressions and Digital Forensic Implications”, Digital Investigation, pp. 77-86, 2014. URL http://www.dfrws.org/2014eu/proceedings/DFRWS-EU-2014-10.pdf.

[10] P. Garbacki, A. Iosup, D. Epema, M. Van Steen: “2fast: Collaborative Downloads in P2P Networks”, Sixth IEEE International Conference on Peer-to-Peer Computing, pp. 23-30, 2006. URL http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=1698587&url=http.

[11] J. Kurose, K. Ross: Computer networking : A Top-Down Approach (Michael Hirsch, ed.). Pearson, 2013. URL http://www.pdfiles.com/pdf/files/English/Networking/Computer_Networking_A_Top-Down_Approach.pdf.

[12] K. Lai, M. Feldman, I. Stoica, J. Chuang: “Incentives for cooperation in peer-to-peer networks”, Workshop on Economics of Peer-to-Peer Systems, pp. 1243-1248, 2003. URL https://www.gnunet.org/sites/default/files/incentives-for-cooperation-in_0.pdf.

[13] Libby Reinish: Group:SyncReplacement. 2013. URL http://libreplanet.org/wiki/Group:SyncReplacement.

[14] Libby Reinish: High Priority Free Software Projects. 2013. URL http://www.fsf.org/campaigns/priority-projects/priority-projects/highpriorityprojects.

[15] T. Locher, P. Moor, S. Schmid, R. Wattenhofer: “Free riding in BitTorrent is Cheap”, The Fifth Workshop on Hot Topics in Networks, pp. 85-90, 2006. URL http://82.130.102.95/publications/hotnets06.pdf.

[16] A. Loewenstern: DHT protocol. 2008. URL http://www.bittorrent.org/beps/bep0005. Accessed[25/05/2014].

[17] P. Machanick: “A distributed systems approach to secure Internet mail”, Computers & Security, pp. 492-499, 2005. URL http://homes.cs.ru.ac.za/philip/Publications/_C_and_S/DistributedMail.pdf.

[18] K. Molin: Measurement and Analysis of the Direct Connect Peer-to-Peer File Sharing Network. 2009. URL https://gupea.ub.gu.se/bitstream/2077/22088/1/gupea_2077_22088_1.pdf.

[19] J. Pouwelse, P. Garbacki, D. Epema, H. Sips: The Bittorrent P2P File-Sharing System: Measurements and Analysis. Springer, 2005. URL http://www.cs.unibo.it/babaoglu/courses/cas04-05/papers/bittorrent.pdf.

[20] J. Pouwelse, P. Garbacki, J. Wang, A. Bakker, J. Yang, A. Iosup, D. Epema, M. Reinders, M. van Steen, H. Sips: “TRIBLER: a social-based peer-to-peer system”, Concurrency and Computation: Practice and Experience, pp. 127-138, 2007. URL http://prlab.tudelft.nl/sites/default/files/CPE-Tribler-2007.pdf.

[21] M. Ripeanu: “Peer-to-Peer Architecture Case Study: Gnutella Network”, Peer-to-Peer Computing, 2001. Proceedings. First International Conference on Peer-to-Peer Computing, pp. 99-100, 2001. URL http://cs-www.cs.yale.edu/homes/arvind/cs425/doc/gnutella-rc.pdf.

[22] A. Tanenbaum, M. van Steen: Distributed Systems - Principles and Paradigms (Toni Holm, ed.). Prentice Hall, 2002. URL https://dcetit.files.wordpress.com/2013/10/ebook-distributed-systems-2nd-edition.pdf.

[23] Tcl/Tk: The Tcl/Tk Developer Exchange. 2014. URL http://www.tcl.tk/software/tcltk/.

[24] A. Tridgell, P. Mackerras: The rsync algorithm, 1996. URL http://rsync.samba.org/tech_report/.

[25] T. Wu: “The Secure Remote Password Protocol”, NDSS Symposium 1998, pp. 97-111, 1998.

[26] W. Yang, H. Chen, Z. Zhang, Yafei D., Z. Zhang: “Deployment of a Large-scale Peer-to-Peer Social Network”, WORLDS '04, 2004. URL https://www.usenix.org/legacy/event/worlds04/tech/full_papers/yang/yang_html/.

[27] eBizMBA: Top 15 Most Popular Social Networking Sites. 2014. URL http://www.ebizmba.com/articles/social-networking-websites. Accessed[15/10/2014].
