Lessons from the Skype meltdown
12-24-2010, 09:00 AM
Post: #1
Lessons from the Skype meltdown
Just a quick thought. In any distributed system, it's important that complete reliance on unstable individual nodes be mitigated by various means of dealing with extraordinary situations. The recent, prolonged Skype breakdown was apparently triggered by mass, software-induced failures of their "supernode" (directory) arrays, which are reportedly ordinary user computers running Skype. These machines operate in supernode mode without the knowledge of their owners, by virtue of being outside firewalls or otherwise being more available to Skype than firewalled systems.
In the IDONS case, I would suggest that associated data for reliability purposes could be cached via well-known, stable, reliable systems (e.g. search engines) that would already be playing significant roles for site discovery under the IDONS model.

--Lauren--
Lauren Weinstein
[email protected]
GCTIP Founder
12-24-2010, 11:06 AM
Post: #2
RE: Lessons from the Skype meltdown
(12-24-2010 09:00 AM)lauren Wrote: In the IDONS case, I would suggest that associated data for reliability purposes could be cached via well-known, stable, reliable systems (e.g. search engines) that would already be playing significant roles for site discovery under the IDONS model.

Having IDONS depend on a search engine introduces a dependency loop that could prove to be problematic. IDONS would be needed to resolve the address of the search engine.

- Joe
12-24-2010, 04:56 PM
(This post was last modified: 12-24-2010 04:59 PM by eternaleye.)
Post: #3
RE: Lessons from the Skype meltdown
(12-24-2010 11:06 AM)Joe Wrote: (12-24-2010 09:00 AM)lauren Wrote: In the IDONS case, I would suggest that associated data for reliability purposes could be cached via well-known, stable, reliable systems (e.g. search engines) that would already be playing significant roles for site discovery under the IDONS model.

Also, the question of how much we can trust the search engine becomes an issue. I'd actually recommend that, rather than trying to be 100% reliable and inevitably failing, we simply say "roughly X% of the network would have to fail to lose an address" and leave it at that.

One example of how we could do that would be combining fault-tolerant replication with the idea of hashing each entity with a series of well-known nonces to distribute entities among widely-spaced DHT peers. In Kademlia, for example, the replication functionality is basically "the N nodes with the smallest distance (cf. the distance function; in Kademlia this is the XOR of the nodes' addresses) replicate all keys closer to that node than they are". This means that if the responsible node fails, the next closest node to the object's hash, which according to the algorithm becomes the new responsible node, *already* has the data, so there is no 'unavailable window'.

For an in-depth study of how different factors affect performance and reliability, see "Analytical Study on Improving DHT Lookup Performance under Churn" ( http://cis.poly.edu/~dwu/papers/p2p06_dht.pdf ) or any of the (numerous) other studies on DHT reliability under churn.

One other thing DHTs usually rely on is that the source of an object will republish it periodically. This may or may not be a good thing for IDONS. On the one hand, it gives records a limited lifetime if the entity that created them no longer exists, preventing outdated and incorrect data from existing in perpetuity. On the other hand, it increases the effort needed on the part of the entity publishing the object.
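The replication rule described above is simple enough to sketch. The following is a toy illustration (tiny integer IDs, no networking, no k-buckets), not a real Kademlia implementation:

```python
# Toy sketch of Kademlia's XOR distance metric and its "the N closest
# nodes replicate the key" rule. IDs are small integers for readability.

def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance between two IDs is simply their XOR."""
    return a ^ b

def closest_nodes(key: int, nodes: list[int], n: int) -> list[int]:
    """The n nodes closest to `key` under XOR distance hold its replicas."""
    return sorted(nodes, key=lambda node: xor_distance(node, key))[:n]

nodes = [0b0001, 0b0100, 0b0110, 0b1011, 0b1110]
key = 0b0111

replicas = closest_nodes(key, nodes, n=3)

# If the closest (responsible) node fails, the next-closest node already
# holds a replica, so there is no 'unavailable window':
survivors = [node for node in nodes if node != replicas[0]]
assert closest_nodes(key, survivors, n=1)[0] in replicas
```

This is why a single node failure never loses a key: availability only degrades once all N replica holders fail before republication.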
12-25-2010, 12:22 AM
Post: #4
RE: Lessons from the Skype meltdown
(12-24-2010 04:56 PM)eternaleye Wrote: (12-24-2010 11:06 AM)Joe Wrote: (12-24-2010 09:00 AM)lauren Wrote: In the IDONS case, I would suggest that associated data for reliability purposes could be cached via well-known, stable, reliable systems (e.g. search engines) that would already be playing significant roles for site discovery under the IDONS model.

This is much closer to what I had in mind. I had never heard of Kademlia by name, so I did a little reading. This is almost exactly what I thought we were going to have to create from the ground up. It's very good news that this already exists.

- Joe
12-25-2010, 02:36 AM
Post: #5
RE: Lessons from the Skype meltdown
(12-25-2010 12:22 AM)Joe Wrote: This is much closer to what I had in mind. I had never heard of Kademlia by name so I did a little reading. This is almost exactly what I thought we were going to have to create from the ground up. It's very good news that this already exists.

Yeah, in my opinion Kademlia is exceedingly cool. Not only is it elegantly simple, but eminently tunable as well. The routing algorithm in particular is just plain cool, while the recursive lookup optimization is a neat twist.

Though thinking about this level of detail is probably premature, I personally would probably use a 256-bit address space in any DHT we base IDONS on, because that enables us to use SHA-256 in the initial run, as it's a hash currently believed to be reasonably strong, and then move to the 256-bit variant of SHA-3 when it's chosen, as it will likely be a strong hash for a long time.
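For concreteness, a minimal sketch of mapping an entity to a 256-bit DHT address with SHA-256. What exactly would be hashed (a public-key fingerprint, a name, ...) is still an open question in this thread, so the input below is only a placeholder:

```python
import hashlib

# Sketch: derive a 256-bit DHT address with SHA-256, as proposed above.
# "entity_record" is a placeholder; the real input is undecided.
entity_record = b"example-entity-public-key"
address = int.from_bytes(hashlib.sha256(entity_record).digest(), "big")

assert address.bit_length() <= 256  # fits the 256-bit address space
```

Swapping in the 256-bit variant of SHA-3 later would leave the address space, and thus the rest of the DHT, unchanged.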
12-25-2010, 07:30 AM
Post: #6
RE: Lessons from the Skype meltdown
There must be some sort of higher level fallback mechanisms. This would not be the same thing as creating a hierarchy for routine use, but total reliance on potentially unreliable "consumer" nodes represents a significant risk -- it's one reason why local mesh topologies structured on such a basis for Internet access have been largely unsuccessful. It makes no sense not to leverage the capabilities of known third-party, stable sites (e.g., major search engines and directories) for discovery and fallback lookup purposes. On a conventional P2P network, lookup failures mean a user won't be able to retrieve particular files right then -- a hassle, but usually not a catastrophe. But failure of a basic lookup addressing system could mean the inability to communicate successfully with any sites for affected users via conventional apps. This is not an acceptable risk, and must be mitigated through parallel fallback methodologies.
--Lauren--

P.S. Let's try to keep quoted text to the minimum necessary on replies. Thanks. --LW--

Lauren Weinstein
[email protected]
GCTIP Founder
12-25-2010, 04:03 PM
Post: #7
RE: Lessons from the Skype meltdown
(12-25-2010 02:36 AM)eternaleye Wrote: Yeah, in my opinion Kademlia is exceedingly cool. Not only is it elegantly simple, but eminently tunable as well. The routing algorithm in particular is just plain cool, while the recursive lookup optimization is a neat twist. Though thinking about this level of detail is probably premature, I personally would probably use a 256-bit address space in any DHT we base IDONS on because that enables us to use SHA-256 in the initial run as it's a hash currently believed to be reasonably strong, and then move to the 256-bit variant of SHA-3 when it's chosen, as it will likely be a strong hash for a long time.

It does look very tunable, and very cool. I think it could be tuned to mitigate any concerns about failure due to inaccessible nodes. If I understood the thinking behind it, the k-buckets would be populated with nodes that are available more than others. Over the long haul this would tend toward a stable network. If we have concerns about availability, we increase the bucket size.

I'm not sure hashes would be the way to go, though. I was thinking more toward a GUID-based identifier. Something like a 192-bit (24-byte) number composed as follows:

- 128-bit (16-byte) random number
- 48-bit (6-byte) MAC address
- 16-bit (2-byte) time

The main reason I don't want to use a hash is that they are predictable given the same input. If you and I create the same name, the hash - and thus the IDONS identifier - would be the same.

- Joe
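The proposed layout above can be sketched as follows. The byte ordering, the big-endian packing, and the placeholder MAC address are my assumptions for illustration, not part of the proposal:

```python
import os
import time

# Sketch of Joe's proposed 192-bit (24-byte) identifier:
#   128-bit random number || 48-bit MAC address || 16-bit time.
# The MAC address passed in below is a made-up placeholder.
def make_identifier(mac: bytes) -> bytes:
    assert len(mac) == 6                                 # 48 bits
    random_part = os.urandom(16)                         # 128 bits
    time_part = (int(time.time()) & 0xFFFF).to_bytes(2, "big")  # 16 bits
    return random_part + mac + time_part                 # 24 bytes total

ident = make_identifier(mac=bytes.fromhex("001122334455"))
assert len(ident) == 24
```

Note that, unlike a hash of a name, two calls with identical inputs yield different identifiers because of the random component, which is exactly the collision-avoidance property Joe is after (and exactly the instability eternaleye objects to in the next post).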
12-25-2010, 04:19 PM
(This post was last modified: 12-25-2010 04:25 PM by eternaleye.)
Post: #8
RE: Lessons from the Skype meltdown
(12-25-2010 04:03 PM)Joe Wrote: It does look very tunable, and very cool. I think it could be tuned to mitigate any concerns about failure due to inaccessible nodes. If I understood the thinking behind it, the k-buckets would be populated with nodes that are available more than others. Over the long haul this would tend toward a stable network. If we have concerns about availability, we increase the bucket size.

Well, I never intended the identifier to be a hashed name - rather, I figured it would be the hashed fingerprint of a public key along with a series of nonces (to distribute it among the peers).

Also, what if somebody changes their network card? Suddenly, their MAC address and identifier change, but it really shouldn't have. I'm thinking of identifiers as identifying 'who' more than 'what', so I'm very reluctant to tie them to hardware.

Also, if you include an explicit 'random number', you end up with malicious peers who choose a number specifically. And if the time is a parameter, then the identifiers would be highly unstable - I think part of the point of identifiers is that they can be used in scripts and such as a stable way to refer to an entity.

Finally, shared hosting would mean that we'd need a way for multiple identifiers to exist for a single entity, which makes including the MAC address somewhat less than ideal.
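The "hashed public-key fingerprint plus a series of well-known nonces" idea can be sketched in a few lines. The fingerprint and nonce values below are invented placeholders; the actual encoding would need to be specified:

```python
import hashlib

# Sketch of eternaleye's scheme: derive several widely-spaced DHT keys
# for one entity by hashing its public-key fingerprint together with a
# series of well-known nonces. Fingerprint and nonces are placeholders.
fingerprint = b"placeholder-public-key-fingerprint"
NONCES = [b"\x00", b"\x01", b"\x02"]  # well-known, agreed on by all peers

replica_keys = [hashlib.sha256(fingerprint + n).digest() for n in NONCES]

# Each nonce yields an effectively independent 256-bit key, so the
# replicas land in unrelated regions of the DHT address space.
assert len(set(replica_keys)) == len(NONCES)
```

Unlike the MAC-plus-random GUID, this stays stable across hardware changes (it follows the key, not the machine) while remaining unpredictable to anyone who doesn't hold the key.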
12-25-2010, 04:24 PM
(This post was last modified: 12-25-2010 04:25 PM by eternaleye.)
Post: #9
RE: Lessons from the Skype meltdown
(12-25-2010 07:30 AM)lauren Wrote: It makes no sense not to leverage the capabilities of known third-party, stable sites (e.g., major search engines and directories) for discovery and fallback lookup purposes. On a conventional P2P network, lookup failures mean a user won't be able to retrieve particular files right then -- a hassle, but usually not a catastrophe. But failure of a basic lookup addressing system could mean the inability to communicate successfully with any sites for affected users via conventional apps. This is not an acceptable risk, and must be mitigated through parallel fallback methodologies.

But the failure mode of a DHT with bad or missing nodes *isn't* an inability to resolve anything - it's that a few specific addresses will be unresolvable, and even that will only last until they are republished, which in a DHT happens periodically anyway.

Also, the question of "How much do we trust this search engine?" still hasn't been answered. If we use a search engine as a fallback, we must trust it completely - because if the DHT fails completely (very unlikely, but potentially possible if someone maliciously aims to do so), the search engine becomes the single trusted entity doing all the lookups - exactly what IDONS is intended to avoid.
12-25-2010, 04:42 PM
Post: #10
RE: Lessons from the Skype meltdown
(12-25-2010 07:30 AM)lauren Wrote: There must be some sort of higher level fallback mechanisms. This would not be the same thing as creating a hierarchy for routine use, but total reliance on potentially unreliable "consumer" nodes represents a significant risk -- it's one reason why local mesh topologies structured on such a basis for Internet access have been largely unsuccessful. It makes no sense not to leverage the capabilities of known third-party, stable sites (e.g., major search engines and directories) for discovery and fallback lookup purposes. On a conventional P2P network, lookup failures mean a user won't be able to retrieve particular files right then -- a hassle, but usually not a catastrophe. But failure of a basic lookup addressing system could mean the inability to communicate successfully with any sites for affected users via conventional apps. This is not an acceptable risk, and must be mitigated through parallel fallback methodologies.

I have to disagree. A lower-level system (IDONS) cannot reliably fall back on a higher-level system (a search engine) which relies on the health of the lower-level system. How could we expect to resolve the address of the search engine? I'll go ahead and say that we should steer far clear of any system that proposes we use hard-coded addresses in case of a failure. In addition, falling back to a less robust system (at least from a security perspective) may prove to be the vulnerability that allows the system to be compromised.

I don't believe that equating Kademlia to a generic local mesh topology is accurate. Kademlia can be tuned to provide a very robust topology. That said, we should of course be constantly vigilant in guarding against availability problems for such a crucial system. And if we suspect any problems, we can adjust the system to cache additional node locations both locally and globally.

- Joe
12-25-2010, 05:25 PM
Post: #11
RE: Lessons from the Skype meltdown
I have no problem with the concept of a relatively few key hard-coded addresses -- which can be updated automatically from time to time -- as part of the fallback mechanisms. There is no need to trust search engines or directories any more than we do now. The addressing data in question can be signed when presented by a search engine (or directory) for authentication purposes.
In a practical IDONS system, search engines/directories will be playing an even larger "discovery" role than they do now. Not using their capabilities to the fullest in the furtherance of multiple reliability layers would make no sense.

There is no "P2P purity test" involved in IDONS technology. How "neat" or "cool" any particular P2P system might be is interesting and worth study -- but little more at this stage.

--Lauren--
Lauren Weinstein
[email protected]
GCTIP Founder
12-26-2010, 12:44 AM
Post: #12
RE: Lessons from the Skype meltdown
(12-25-2010 05:25 PM)lauren Wrote: I have no problem with the concept of a relatively few key hard-coded addresses -- which can be updated automatically from time to time -- as part of the fallback mechanisms. There is no need to trust search engines or directories any more than we do now. The addressing data in question can be signed when presented by a search engine (or directory) for authentication purposes.

If we use hard-coded addresses at all, they should point to stable IDONS servers - not to search engine servers. Search companies have shown their willingness to cooperate with various corporate and governmental entities to an extent far beyond what is required by law. It is entirely rational to assume that those that haven't behaved in such a manner yet simply haven't - yet.

Using a search engine as a fallback violates the "No centralized control" requirement, because at the IDONS server's most vulnerable point in time, we would have it pulling critical data from a minimally trusted outside system. In addition, it is likely that getting the IDONS information into or out of the search engine would violate the "All communications encrypted" requirement. Yes, we could make sure the data is signed by an IDONS server. But then again, we could do that with another stable IDONS server rather than going outside the system. I just don't see a good reason to assume that a hard-coded IP address of a search engine would be more available than a stable IDONS server's IP address. Even if we do make that assumption, why should we also assume that it is so much more available than the IDONS server that it is worth exposing IDONS to the risk of having to trust the external system?

(12-25-2010 05:25 PM)lauren Wrote: In a practical IDONS system, search engines/directories will be playing an even larger "discovery" role than they do now. Not using their capabilities to the fullest in the furtherance of multiple reliability layers would make no sense.

To my knowledge, search engines/directories currently play absolutely no role in name resolution. It makes no sense to give them that role, given that they are owned and controlled by entities that have tendencies, incentives, legal obligations, requirements, etc. to behave in ways that do not put IDONS as their top priority.

(12-25-2010 05:25 PM)lauren Wrote: There is no "P2P purity test" involved in IDONS technology. How "neat" or "cool" any particular P2P system might be is interesting and worth study -- but little more at this stage.

I'm not clear on what you mean by a "P2P purity test". I'm not attached to any particular P2P system. However, this problem certainly seems to lend itself to a P2P solution, and just because something is "neat" or "cool" doesn't mean we should summarily reject it. P2P systems are definitely interesting and worth study. And given the requirements you stated for this project (fully distributed, no centralized control, fault tolerant, etc.), a P2P system seems to be exactly what we should use.

We're clearly disagreeing on this point. Perhaps I'm just not "getting it" yet. What am I missing?

- Joe
12-26-2010, 01:37 AM
(This post was last modified: 12-26-2010 01:39 AM by eternaleye.)
Post: #13
RE: Lessons from the Skype meltdown
(12-26-2010 12:44 AM)Joe Wrote: (12-25-2010 05:25 PM)lauren Wrote: In a practical IDONS system, search engines/directories will be playing an even larger "discovery" role than they do now. Not using their capabilities to the fullest in the furtherance of multiple reliability layers would make no sense.

I think it's important to make clear that there are *two* things being proposed for the search engines to do.

One is "discovery" enablement. This comes down to having the search engines index IDONS entities. It requires nothing from IDONS itself besides some search engine thinking it's important enough to index, and there being a way for the search engine to resolve IDONS addresses. Thus it really needs very little discussion, as it is exactly the same for IDONS as it is for DNS. It all comes down to: discovery is not the purpose of a naming system. It is out of scope, and should be done by an application, such as a search engine, which spiders the web.

The other is as a fallback mechanism for resolution. This makes me profoundly uncomfortable, for several reasons:

1.) It creates a circular dependency. In the future, the ideal situation is one in which DNS is no longer used and IDONS is the sole name resolution system. In this situation, which is the goal, accessing any search engine requires resolving its name via IDONS. If the case 'IDONS is out of commission' is handled by asking a search engine for name data, then we have a dependency loop - we need IDONS, but since it's down, we need to ask a search engine; but we need IDONS to find it!

One possible 'solution' is to statically distribute specific IP addresses for the relevant search engine. I quote the word 'solution' because it has too many flaws to be a viable one.

a.) Any search engine powerful enough to serve this purpose is run from multiple sites, and a load-balancing system selects the right one for the client. If we distribute static IPs, one site will get the traffic from all IDONS users, possibly overloading it.

b.) If the site whose IP address we distributed is down, but other sites are up, lookups will still fail.

c.) If the IP address we distribute is out of date, lookups will fail. Maintaining this data imposes a maintenance burden.

d.) Updating the distributed IP addresses must be authenticated in some way to prevent a malicious entity from performing a DoS attack (by replacing all IPs with incorrect ones) or a redirection attack (replacing them with malicious IPs). This requires a central authority, which is stated in the goals to be unacceptable.

2.) It is untenable in a security sense. We would have to trust the search engine to preserve these invariants:

a.) The data would not be removed.
b.) The data would not be altered.
c.) Clients would not be denied access (possibly depending on the client!).

Lauren, you suggested signing the data. That only addresses (b), leaving (a) and (c), which I consider greater threats, unresolved. Altering the data is not the only attack vector; removing it entirely or denying it to select people/groups is just as dangerous.

3.) It is out of scope for the search engines, loads their servers, and doesn't bring them revenue. Why would they even allow it? Let's use Google as an example. They get their money from ads. If we use the search engine as a cache of IDONS information, then any resolution will be downloading that data from their servers in an automated manner, never displaying a single ad. They gain nothing from allowing this; why would they do so?

4.) How, exactly, would we store this data in the search engine? We would need some way of storing small amounts of data, reliably and securely, on an untrusted server that isn't designed to store things, only to index them.

I think this is just plain *infeasible*, in addition to being insufficient to meet the requirements of IDONS.
12-26-2010, 06:09 AM
Post: #14
RE: Lessons from the Skype meltdown
(12-26-2010 01:37 AM)eternaleye Wrote: In the future, the ideal situation is one in which DNS is no longer used and IDONS is the sole name resolution system.

What do you mean by "DNS no longer used"? Do you refer to the DNS hierarchy of registered names, or to the DNS name resolution protocol? So far I do not see why the protocol needs to be touched at all. Using the DNS hierarchy, however, is just a matter of configuration. If I were to edit my /etc/bind/db.root file to make my own server authoritative for .idons, I could start using it right away (just that it would be my IDONS, different from yours). Using the DNS hierarchy would become a fallback option immediately - and could be taken out as soon as my .idons resolution is "sufficiently similar" to yours to allow us to practically exchange most names of our common interest. To achieve this "sufficiently similar" state, we would need to configure delegation of 2nd-level zones to known communities, which maintain their own namespaces. To my understanding, that appears to be the topic IDONS would define.

(12-26-2010 01:37 AM)eternaleye Wrote: b.) If the site whose IP address we distributed is down, but other sites are up, lookups will still fail.

So, given that the IDONS peer we are asking runs on a quorum of hosts, each of them would be named as NS records. If one IP is down, the others will answer.

(12-26-2010 01:37 AM)eternaleye Wrote: d.) Updating the distributed IP addresses must be authenticated in some way to prevent a malicious entity from performing a DoS attack (by replacing all IPs with incorrect ones) or another attack (replacing them with malicious IPs). This requires a central authority, which is stated in the goals to be unacceptable.

I have to object: this requires authority, but not central authority. Each community (that is, each subset of the world's peers which share a part of their respective namespaces in a consistent manner) would need to implement rules (aka a registration and dispute policy) for how authority over their directory is maintained. (Those rules will probably need to be automated. See also here under level 3.)
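The configuration being described might look roughly like the following. This is only a sketch of the idea, not a tested setup; the community name, file paths, and IP addresses are all invented:

```
// Hypothetical named.conf stanza: make this BIND server
// authoritative for a local .idons zone.
zone "idons" {
    type master;
    file "/etc/bind/db.idons";
};
```

Inside that zone file, a 2nd-level zone would then be delegated to a community's own servers, with several NS records so that one dead IP doesn't break lookups:

```
; Excerpt of a hypothetical /etc/bind/db.idons:
community.idons.      IN NS  ns1.community.idons.
community.idons.      IN NS  ns2.community.idons.
ns1.community.idons.  IN A   192.0.2.1
ns2.community.idons.  IN A   192.0.2.2
```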
12-26-2010, 06:18 AM
Post: #15
RE: Lessons from the Skype meltdown
(12-24-2010 04:56 PM)eternaleye Wrote: Also, the question of how much we can trust the search engine becomes an issue. I'd actually recommend that, rather than trying to be 100% reliable and inevitably failing, we simply say "roughly X% of the network would have to fail to lose an address" and leave it at that.

That's why I propose the idea of running registries at multiple independent sites without central control. See also this posting: http://forums.gctip.org/thread-89-post-291.html#pid291

Example: with byzantine consent, your X% becomes slightly more than X=30.
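The "slightly more than X=30" figure can be sanity-checked: byzantine agreement among n replicas stays correct only while n >= 3f + 1, so at most f = floor((n-1)/3) of them may fail or misbehave, and losing an address takes more than that. A quick sketch of the arithmetic:

```python
# Byzantine fault tolerance: with n replicas, correctness requires
# n >= 3*f + 1, so at most f = (n - 1) // 3 replicas may be faulty.
def max_faulty(n: int) -> int:
    return (n - 1) // 3

for n in (4, 10, 100):
    f = max_faulty(n)
    print(f"n={n}: tolerates f={f} faulty replicas ({100 * f / n:.0f}%)")
```

With n=10 registries this gives f=3, i.e. exactly the 30% threshold mentioned above; as n grows, the tolerated fraction approaches one third.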