MAT Working Group
23rd May 2019
At 4 p.m.:
BRIAN TRAMMELL: Hi, I am Brian; together with NINA, I co‑chair the RIPE MAT Working Group. If you are looking for DNS, that's downstairs, but we will probably have a much more civilised discussion up here.
So, welcome to Iceland. This is Iceland, this is what Iceland looks like. That is a photograph I took last weekend out on the peninsula in the west.
We have a pretty full agenda today, so please come on in. If you are a speaker who is listed here, I would like to see your hands, just to make sure we have everybody in the room. I saw Jari come in. Perfect.
So, we have a bit of a grab bag of measurement and measurement‑related topics today: a couple of measurement studies, I forget the first names ‑‑ Massimo Candela looking at edge measurements and speed measurements in one of the RIPE regions. Then a couple of things on buffers and bandwidth, and something on the available bandwidth estimation problem with some evaluation of that. Then we will go on to everyone's favourite topic in the IETF, encryption in the transport layer; Jari Arkko will be talking, only for 15 minutes, so we are not going to solve it today. And we will have the customary RIPE NCC tools update.
The scribe for remote is Mikala, and we have a transcription for the minutes. So let's run a few minutes early because we have a packed schedule. Danilo Giordano, can you come on up.
DANILO GIORDANO: Thank you very much. My name is Danilo Giordano, I am an assistant professor, and before starting I would like to thank RIPE for letting me be here to listen to many interesting topics and also for being able to present this evening. This is work I did at my home university, the Polytechnic of Turin, which is devoted to data analysis, in this case network data analysis: a study of the last five years of evolution of the Internet from the perspective of an ISP, that is, watching the Internet from within the ISP network.
So, from the network of an ISP which is located in Italy. First, the reason why we decided to do that: not because the network is something easy to understand, not because we have the data, but because it is very important, since someone stated that the Internet is the first thing that humanity has built that humanity does not understand, the largest experiment in anarchy that we have ever had. This is from the ex‑chairman of Google. So if someone as big as Google states that we cannot understand the Internet, how can normal human beings like us, who have a far smaller view of the network, understand it?
We do it by measuring, trying to understand several aspects posed as questions. The first is how much a user costs the ISP in terms of how much data the user consumes.
Secondly, how habits changed in terms of the services that users access. Then, which protocols are used in the network and how they changed over time.
And finally the infrastructure.
Before looking at the results, we have to understand how we made these measurements and how we got the data. So, we have the network of the ISP, and in this network we have two families of households: 5,000 households connected with fiber, so a fast network, and 10,000 with ADSL, so a bit slower in upload. They are connected to the Internet through their ISP, and we instrumented the border of the ISP, so two points of presence, with a passive probe called T‑Stat, which is able to monitor the traffic from the user to the network and back. In this way we monitored TCP flows: for each connection we monitor a flow, and in five years, so up to the end of 2017, we recorded 250 billion flows, so quite a lot, which is about 32 terabytes of compressed data.
Then, we used T‑Stat to monitor these flows and collect several statistics about them. In particular, we have information such as the client IP ‑‑ for those who are privacy aware, don't worry, this is anonymized in our data to avoid any privacy leakage. We have the server IP that served the flow, which port was used, how much traffic in terms of flows and bytes or packets, which protocol was used, etc.
Then, we annotated the flows with further information, collected with a plug‑in called DN‑Hunter, which allows us to know which service was requested by the client, in particular by using the DNS conversation or the information inside the certificate, so the TLS SNI information. So, for instance, we may have the information that a flow was related to Facebook.com. However, as Jon Postel said, the name indicates what we seek, but here, Facebook.com, yes, it's a name, but not completely understandable. For instance, instead of Facebook.com you may be facing Facebookstatic.AcmeIHD ‑‑ so we translated such names into the actual service name, tagging everything related to Facebook over the years as Facebook, etc. We did that for several services, the most important ones, in order to tag each flow with the actual service name, like Facebook.
Then, we processed all these data with Spark in our data centre, in order to aggregate the information at different levels and run several analytics to answer the questions that I posed at the beginning.
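The name‑to‑service tagging described here can be sketched roughly as follows. This is only a minimal illustration, not the rule set used in the study: the `SERVICE_RULES` table, its patterns and the `tag_service` helper are hypothetical names introduced for the example.

```python
import re

# Hypothetical rule table (an assumption for illustration): each regular
# expression is matched against the server name seen in DNS or TLS SNI.
SERVICE_RULES = [
    (re.compile(r"(^|\.)facebook\.com$"), "Facebook"),
    (re.compile(r"(^|\.)fbcdn\.net$"), "Facebook"),
    (re.compile(r"(^|\.)whatsapp\.(com|net)$"), "WhatsApp"),
    (re.compile(r"(^|\.)instagram\.com$"), "Instagram"),
]


def tag_service(server_name):
    """Map a server name to a service label, or "Other" if no rule matches."""
    name = server_name.lower().rstrip(".")
    for pattern, service in SERVICE_RULES:
        if pattern.search(name):
            return service
    return "Other"
```

Anchoring each pattern on a leading dot or the start of the name avoids tagging look‑alike domains that merely end with the same letters.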
The first one was how much a user costs the ISP in terms of traffic consumption. To answer that, in this slide we have on the right the complementary distribution function of how much each user downloaded in 2014, for ADSL customers. You can see that each user downloaded less than 10 gigabytes per day. On the left instead we have the upload, and you see that they uploaded about one gigabyte per day.
If you look at the situation in 2017, you see that, for FTTH customers as for ADSL customers, they doubled both what they download and what they upload.
If you look at the difference with FTTH customers, who have a faster connection in download and especially in upload, these customers tend to download 25% more data but actually double the upload, due to the faster connection available.
Very interesting is to look at the tail of the distribution, in particular for upload, where you see that there has been a change in behaviour: indeed, heavy upload decreased because of the drop of peer‑to‑peer, for instance ‑‑ as well.
Next, we decided to study what the user behaviour was five years ago and what it is now. Here we have the percentage of active clients that actually visited each service. Pay attention: you have to count the visits, not every time a service is contacted, because, for instance, you may contact Facebook because of ‑‑ many websites.
So here, you see YouTube, which rose from 30 to 50%. Then we move to the main search engine services like Google, then Bing, which rose from 15 up to 45% because of telemetry; DuckDuckGo was not used and, unfortunately, still is not.
Then, we decided to move to popular applications, social media applications like WhatsApp and Instagram. I am sure almost everyone here has them on their smartphone. About the popularity, what we saw is that, among home users connecting over their wi‑fi, about 60% nowadays use WhatsApp, with either an ADSL or an FTTH connection. And the traffic consumption actually rose from a few megabytes up to more than 10 megabytes per day; this is for messaging, so it should not be that high. Particularly interesting are Christmas and New Year, when everybody exchanges wishes with everybody else. Looking at Instagram, you see that the usage did increase, but more dramatic is the consumption: the average consumption per day was less than 50 megabytes until mid‑2016 and then quickly rose to more than 150 megabytes per day. So, before, there were a few customers using Instagram and generating not too much traffic; now there are many more, generating much more traffic, three times what we were used to two years ago. This shows that WhatsApp and Instagram are also frequently used at home over the wi‑fi connection.
We move to the last social network application, studying the traffic consumption per day and focusing in particular on 2014, because during 2014 Facebook decided to enable auto‑play for their videos, so whenever you watch a Facebook page the video starts playing automatically. You see that there is a rise, then a drop because they decided to stop auto‑play for a short time, and then they started again in the summer, when the traffic more than doubled per user.
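For readers unfamiliar with the plot type, a complementary distribution function gives, for each volume x, the fraction of users whose daily volume exceeds x. A minimal sketch of how such a curve could be computed from per‑user daily volumes is shown below; the `ccdf` helper and its input format are assumptions for illustration, not the analysis code used in the study.

```python
def ccdf(values):
    """Return sorted (x, P[X > x]) points of the empirical distribution."""
    ordered = sorted(values)
    n = len(ordered)
    points = []
    for i, x in enumerate(ordered):
        # For repeated values keep only the last occurrence, so that each
        # x appears once with the fraction of samples strictly above it.
        if i + 1 < n and ordered[i + 1] == x:
            continue
        points.append((x, (n - (i + 1)) / n))
    return points
```

Plotting these points on logarithmic axes makes the tail of heavy users, which the speaker discusses next, easy to read.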
Finally, we also moved to study the protocols. We see how the traffic was shared among protocols: in the beginning it was mainly HTTP or TLS, then we see that Google decided to move from HTTP to TLS, Google started using QUIC, then SPDY, then disabled QUIC for a short time, and finally Facebook Zero appeared. You can see how they actually decide to implement their own protocols and use the network to play with them, moving most of the traffic from HTTP to other protocols. And this can be a problem, for instance, for network management that has to block something and now has to design new rules related to the new protocols.
The last aspect is the infrastructure. We decided to monitor the CDNs, since CDNs should bring content closer to the user. What we have is the probe, T‑Stat, located between the customer and the server, and we can monitor the time between the probe and the server as the core network delay, so the roundtrip time needed to get the content. What we also have is the access network delay, which is the delay from the customer to the probe, but this delay should be negligible with respect to the other one because the probe is at the edge of the point of presence. So, by monitoring the core network delay, we monitor the delay for two services. First, Facebook in 2014: you see that about 40% experienced a roundtrip time of 3 milliseconds and 80% of 20 milliseconds.
You see that for ‑‑ it does much better: 80% is at 3 milliseconds and about 40% at less than half a millisecond.
So we are actually breaking the millisecond access.
So, in conclusion, you see how the traffic almost doubled and how the infrastructure keeps concentrating and specialising, since the network is moving closer to the user. Thank you.
BRIAN TRAMMELL: So we have a couple minutes for questions. Two short or one long. Or zero. Okay. Thank you very much. And next up is Massimo.
MASSIMO CANDELA: So, anyway, from NTT Communications, good evening everybody.
So, today's presentation is about some number crunching we did on what we believe is an anomaly. The idea is to shed some light on this and hope that we can improve it.
The work and the presentation doesn't necessarily represent the point of view of NTT communications but I am happy they are open to whatever collaboration and work we want to do and they support us sending us around so thank you very much.
And so, I am talking about the speed of Internet. This term, speed of Internet, has been used for at least 20 years in the literature, and it is not a bit rate, it is not how much time it takes to open a page. It is an actual speed, expressed in kilometres per millisecond. It is essentially how much space, how much wire, your packet travels in a unit of time, that is, the millisecond. And you really need that when, for example, you are doing active geolocation: you have an IP address and you would like to know where it is, you use some probes or sources around the world to do latency measurements towards that target, and at some point you need to convert that latency measurement to space, to kilometres, and for that you use the speed of Internet.
So, in general, the speed of Internet that is used is a fraction of the speed of light, like two‑thirds of the speed of light or four‑ninths of the speed of light, which should resemble the propagation of light in the optical fiber. That is correct, but it is strictly an upper bound. You can prove that if you do measurements: we used RIPE Atlas, we did 23 million ping measurements and we ‑‑ well, knowing exactly where both the end points were. You can clearly see that the real speed of Internet is not only lower than that but is also really dependent on the region where you are.
So, why does it depend on the region? This is a clear example of what is going on. When you try to calculate the speed of Internet, you calculate the distance based on the surface of the earth. But you don't see what the wire is actually doing underneath; in this case the car is going to take much more time to reach the end point compared to what the dashed line says. Each region has a different level of circuitousness, and in particular a low number for the speed of Internet is a big hint of a high level of circuitousness. In particular, this research started when we were trying to geolocate stuff in the Middle East, where you basically cannot get anything out of it. We were using RIPE Atlas, so there can be other factors, but we clearly see that it is really difficult to geolocate anything with reasonable accuracy. So I put an image of a black hole here, but we recently discovered that this is what a black hole looks like, so I have to update my presentation.
But if you want something slightly more scientific, let's say, we can go with a scatter plot of distance over milliseconds. I am comparing here Europe and Middle East, and I am going to tell you later why them. The blue points are Europe: we see the cluster is compact and you can perceive where the speed of Internet line goes. The red ones instead are scattered all over the place, and clearly you need more milliseconds to cover the same number of kilometres.
So now, what we did: we moved on from these ping measurements and did the same with traceroute measurements. We collected only the ones that reach the real destination and only the ones departing and ending in Middle East or in Europe. We are comparing these two because they have similar average distances, 1200 and 1300 kilometres, and distance is a factor that impacts speed, so you would like to have similar comparisons. We already see a roundtrip time of 150 against 53, but what is really important is this value, the speed of Internet, which is 27 for Middle East and 66 for Europe. Now, this presentation is about Middle East because, when we divided the world into regions, Middle East was by far an outlier. We repeated the test multiple times because we were thinking something was wrong, but each time this is much, much lower than all the other ones.
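The conversion between latency and distance can be sketched as below. This is a rough illustration under stated assumptions, not the authors' method: it assumes the distance is the great‑circle (haversine) distance between the known end points and that the one‑way delay is half of the measured roundtrip time. The function names and constants are introduced only for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0          # mean Earth radius
LIGHT_KM_PER_MS = 299.792458      # speed of light in vacuum, km per ms


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def speed_of_internet(distance_km, rtt_ms):
    """Kilometres per millisecond, assuming one-way delay = RTT / 2."""
    return distance_km / (rtt_ms / 2.0)


def max_distance_km(rtt_ms, fraction_of_c=2.0 / 3.0):
    """Upper bound on distance implied by an RTT at a fraction of c."""
    return (rtt_ms / 2.0) * fraction_of_c * LIGHT_KM_PER_MS
```

With these definitions, any measured speed can be compared against the two‑thirds or four‑ninths bound mentioned in the talk; a value far below the bound suggests that the path is much longer than the surface distance.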
At some point we discovered one of the reasons: 60% of these paths actually go outside of the region. So even if the start and end points are in the region, they go outside of the region, while this happens for only 5% in Europe, and we have a roundtrip time that goes up to 80 milliseconds compared to 63. So a big chunk of these traceroutes actually go somewhere else, really far.
We will see where. And if you look at the hop count, we have 18 compared to almost 14, in both cases for paths going out of the region, so both are not good. But we have around the same AS path length.
So we divided the data into subsets, and we took the measurements between different countries, so basically same region but different countries: A to B, where A and B are in different countries of the same region. 80% of them go outside the region, so they are not even passing through Middle East. Also in this case the roundtrip time is pretty high, 80 milliseconds. And in particular, for neighbouring countries, so countries that actually share a border, 30% go out of the region: not only out of the countries, out of the region.
If we do the same using source and destination in the same country, we discover instead that the situation is not bad at all. So of 8700, which were the successful ones with the probes, blah‑blah‑blah, we got that almost 8200 stayed local, not only in the region but in the country. And, I mean, in the numbers you see that the count of paths not staying local in the country and the count going out of the region are similar.
So this means that if you cross the border of the country, you just go out of Middle East. But where does it go? We did this bar chart: this is the top ten of the countries where you go when you leave the region, and we clearly see that they are predominantly European countries plus, for example, the US. And there is a strong German presence here. We also looked a bit at Internet exchange points, and this is a summary: we saw that only 1.9% of these paths cross an IXP, and when they do, this is where the IXPs are placed. We tried to check these numbers again, also by geolocating the hop before and after the peering LAN, and we can confirm them. Actually 18.9 go to an IXP, and in general these are anyway outside of Middle East, European IXPs.
So now, to conclude. If you want to look at all the data, because this is just a short presentation about it, you can find it online with the title of the presentation. The summary is that countries in Middle East don't talk to each other; this may reflect some social and political situation. Even if they share a border, 30% of them don't communicate directly with each other, and this impacts the roundtrip time and may also impact peer‑to‑peer traffic. I know that the user experience is different, because users probably go to content providers, which are optimised in a different way, and we didn't analyse that; but peer‑to‑peer traffic is important and, most of all, I cannot do geolocation, and I really care about that.
This is not such a novelty, I mean, there is similar research; I wanted to characterise this with all the data and put some ‑‑ basically nobody talked about Middle East in particular before. And I think that, I mean, it could be an incentive to peer, to peer with neighbours, anyway to peer close. So basically that's all for my presentation, and I just want to thank my co‑authors, from the University of Pisa and the National Research Council of Italy, who helped me with this presentation. And now it's question time for real.
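The "goes outside the region" classification can be illustrated with a small sketch. It assumes each traceroute has already been reduced to a list of per‑hop country codes (with `None` for hops that could not be geolocated); how hops were actually geolocated in the study is not described here, and the helper names and the example region set are hypothetical.

```python
def leaves_region(hop_countries, region):
    """True if any geolocated hop lies in a country outside the region."""
    return any(c is not None and c not in region for c in hop_countries)


def share_leaving(paths, region):
    """Fraction of paths with at least one hop outside the region."""
    if not paths:
        return 0.0
    return sum(leaves_region(p, region) for p in paths) / len(paths)


# Hypothetical example: three paths whose end points are all in the region.
EXAMPLE_REGION = {"AE", "SA", "QA"}
EXAMPLE_PATHS = [
    ["AE", "DE", "SA"],   # detours through Germany
    ["AE", "SA"],         # stays in the region
    ["QA", None, "QA"],   # unknown hop is ignored
]
```

Ignoring hops without a location is a conservative choice: a path is only counted as leaving the region when there is positive evidence of a hop elsewhere.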
AUDIENCE SPEAKER: Dmitry from Adobe. First of all, someone should fund an independent IXP, because if they don't talk to each other maybe they will talk to the independent guy. But the real question is: how much of the difference in speed you found do you think could be because of lawful intercept in those countries, or other devices like that?
MASSIMO CANDELA: That is a good question, and we did an experiment about that. I would say it's a really basic experiment, where we tried to understand if there were points where the traffic goes most ‑‑ where the path goes most of the time. We didn't find anything remarkable, so I believe it may be, but I don't have results: in the short experiment we did, we didn't find any, I would say, point of interception that was clear to us from the outside. So if you are aware of something more, we can talk about it, but...
AUDIENCE SPEAKER: Because I don't think they have spools of fibers lying around in the street just because, just to have a slower speeds, so ‑‑ so those ‑‑ those milliseconds are happening somewhere.
MASSIMO CANDELA: Yes, if you consider that there is a higher number of hops, and also the fact that there are, I mean, traceroutes that go from Middle East to, well, Germany already, but they also go to, I don't know, the US, I mean, for as much as you want to make a fuss, you are going to increase ‑‑ I also have here, I didn't show this, but when it's local it's 55, when it's not it can on average go to 250. So the roundtrip, independently from the ‑‑ maybe there is a path where there is significant traffic or whatever, the ‑‑ introducing a lot of roundtrip time.
AUDIENCE SPEAKER: Only measurement I could ‑‑ is to have probes talking to each other in the same operator or something like this.
MASSIMO CANDELA: Okay. I think from our data set it's pretty extended we could filter out and have numbers about that. The measurements are probably already there.
NINA: I put on my Working Group chair hat and remind people: one question each, and keep it brief while you are asking questions. And we have the microphone in the back.
AUDIENCE SPEAKER: Janus. Earlier we heard about different routing decisions being made in IPv4 versus IPv6, so my question is: did you differentiate your speed of Internet measurements by v4 and v6, and what were your results if you did so?
MASSIMO CANDELA: Also in this case, that is still ongoing. We did a first analysis, but we have a big number of measurements; on average there was nothing significant that we detected, but I have to analyse it better.
AUDIENCE SPEAKER: Thank you.
CHAIR: I am NINA, and this is with my day job hat on, from Netflix. I find this interesting because it intertwines with the interconnect Working Group: the thing we see here, we can confirm from the operators' side that it is what we see, and it is not just a question of "oh, go peer", because "oh, go peer" in Middle East is not very easy because of the way the market works. But it's really good to have this kind of study, and maybe you could look more into what the consequences for Internet quality in the country would be, because that might help persuade the market players in the region that they need to change their ways, so Middle East could have good Internet as well.
MASSIMO CANDELA: Thank you very much for feedback.
AUDIENCE SPEAKER: So my question is: there are some cases that you possibly saw and then had a little headache trying to explain. For example, let's say you have many probes, but the IP that you are trying to reach is ‑‑ an Anycast IP, so it's probably not the same point of termination that is responding each time. Or, how can I say, how can you be sure that the latency that you see is not due to a crappy device at the service provider side, and is really related to the distance itself?
MASSIMO CANDELA: So, first of all, even if this started as a geolocation effort, it's mostly here to show just the circuitousness of the network, so I didn't mention the geolocation part. We used ground truth: to compute the speed of Internet, we know exactly where the end points are, because the RIPE NCC placed them there, and for RIPE Atlas probes we know exactly where they are. Of course there can be devices in the middle, and even a simple router is going to introduce some delay for sure. But that is part of the data set; they create some outliers, but they should not impact 23 million measurements.
AUDIENCE SPEAKER: Thank you.
CHAIR: Thank you very much.
Next up is Te‑Yuan Huang.
TE‑YUAN HUANG: Hello, cuz I am a short person. So, attending conferences can be tiring, but it's fun, and I don't know about you guys, but for me, after tomorrow I am just going home to lie down on my couch and play some Netflix.
But from time to time, when you play something, after you click play there is still some delay between the click and the video showing up, right? And there are so many factors that go into that delay. Today I am going to draw your attention to one of those factors, which is buffer sizing and active queue management. These are two concepts so deep inside the network that sometimes people don't draw the connection between them and the end‑user‑facing metrics, and that is why today we are going through the journey together and having some fun.
So, just to do a quick review: Netflix streams our content through our CDN, the Open Connect network. The green dots are the deployments directly embedded inside our ISP partners. Here in Iceland we do have one embedded right there. And the orange dot there is the IX site that we have.
People connect to us through various devices, and the typical connection is through their home gateway and then to our Open Connect sites. Congestion can happen either on the uplink or on the downlink, so we will talk about each case individually.
Let's talk about the downlink first.
So, to understand how the buffer sizing on our downlink can affect ‑‑ we do the following experiment. We have two stacks, each one connected to a different router. The upper router has the smaller buffer, roughly equal to two milliseconds' worth of buffer, and the lower router has the larger one, roughly 40 milliseconds of buffer. These two then connect through our PNI to the ISP partners. Additional ‑‑ we use RACK and NewReno, and the traffic is equally distributed across the two stacks, so the buffer sizing on the router is the only variable in the experiment. It's still worth noting that even though the traffic is equally distributed, it is not randomly assigned, so in the most rigorous terms it is not an A/B test, but it is still a valid experiment. The observations we have here are fairly preliminary; we would love to spend more time to understand the root cause, but we would love to share the results first so that we can raise some awareness in the community.
So, let's look at the network metrics first. I think it's fairly well known that if you have a bigger buffer you are more likely to suffer from the bufferbloat problem, and this is exactly what we see here. The red line is the big buffer stack and the blue line is the small buffer stack, and you can see that, whether you look at the min RTT or the max RTT, there is a significant difference, roughly 100 to 150 milliseconds. In terms of ‑‑ goodput from the host stack, there is no noticeable difference. But in terms of the retransmission rate, it's also as expected that the small buffer stack suffers from a slightly higher retransmission rate, about 1.5% higher.
Now it's like, the retransmission rate is higher, what should we do? At the end of the day, Netflix cares about the end‑user‑facing experience, and that should be the decision‑making factor here. And that's what we do: we look at the QoE metrics instead. There are several that we care about. The first metric here is the rebuffer rate, which is how frequently the user's viewing experience gets interrupted because the download cannot keep up with the playback speed.
And here we see that there is a sizable difference between the big buffer stack and the small buffer stack; in fact it's 30% worse in terms of rebuffer rate. Another thing is play delay. We already know that the roundtrip time on the small buffer stack is shorter, but translated to play delay it's about one second's worth of difference, and that is a delay users can perceive.
And that has a very interesting implication in other parts of the service as well, in particular on supplemental playback. If you are not familiar with supplemental playback: when you enter Netflix you will see this browsing mode, and when your mouse hovers over one of those titles you will see some playback that we show to give you more information about that particular title. That's what we call supplemental playback. It stops playing if your mouse moves away, and as a result the play delay actually has an impact on how many supplemental playbacks can actually play, because there is a limited amount of time between the user's mouse hovering over the box and moving away. In fact, for the small buffer stack, because of the shorter play delay, the number of playbacks we have on the browser player is actually 30% more, and on the TV player it's 20% more. That's a huge difference.
And the difference between the platforms is just that the UI is different and the control device is different: it's easier to move the mouse away in a browser than on a TV.
Another thing we found interesting is that video quality, in aggregate, is lower on the small buffer stack. We were scratching our heads: what is going on? But if we draw out the distribution, things get very interesting. The way to read this graph: the X axis is the percentile throughout the distribution, the Y axis is video quality. So you need to draw a vertical line somewhere in the figure, so that you fix yourself at a certain percentile, and read the difference on the Y axis. What we are seeing here is that at low percentiles, meaning low‑end users, the small buffer stack actually has higher video quality. If you look further on, at the higher percentiles, then you see there is a drop. So essentially there is some resource reallocation happening here. When we shorten the buffer size on the router we are essentially speeding up the feedback loop and allowing the resource to be distributed more fairly.
And we think this is a worthwhile trade‑off, because the improvement at the lower end is usually more noticeable than the degradation at the higher end. That is the downlink case.
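Buffer sizes quoted in milliseconds are relative to the link rate: the buffer holds as many bytes as the link can drain in that time. The sketch below shows the arithmetic only; the 10 Gbit/s port rate used in the example is a hypothetical value, since the talk does not state the link speeds involved.

```python
def buffer_bytes(link_rate_bps, buffer_ms):
    """Bytes of buffering that drain in buffer_ms at link_rate_bps."""
    # bits = rate * ms / 1000; bytes = bits / 8
    return link_rate_bps * buffer_ms // 8000


# Hypothetical 10 Gbit/s port (an assumption, not a figure from the talk).
EXAMPLE_RATE_BPS = 10_000_000_000
SMALL_BUFFER = buffer_bytes(EXAMPLE_RATE_BPS, 2)    # 2 ms of buffer
LARGE_BUFFER = buffer_bytes(EXAMPLE_RATE_BPS, 40)   # 40 ms of buffer
```

The same number of milliseconds therefore corresponds to very different byte counts on links of different speeds, which is why the time‑based figure is the more portable way to describe a buffer.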
On the uplink, congestion usually happens inside the user's home, where we don't have control, and it is also harder to decide what the proper buffer size is. However, there is a wide deployment of AQM ‑‑ it is just not enabled yet on a lot of home user devices, and that is where we are trying to measure how AQM can affect QoE metrics. We don't have control over those devices, so what we did is set up a link of 15 megabit down and three megabit up. The buffer sizes on those devices are 500 milliseconds down and 500 milliseconds/200 milliseconds up. And the key variable we are controlling here is whether we use FIFO or FQ‑CoDel. We have bug ‑‑
So you can see from the results that if we use FQ‑CoDel the play delay is consistently low. FIFO is consistently worse, and if the buffer is larger, FIFO suffers even more.
Another thing, other than play delay, is how quickly the video quality can ramp up.
On the left‑hand side, which is the FIFO one, you see the ramp‑up took about 80 seconds. With FQ‑CoDel, on the other hand, it took 42 seconds to ramp up, and the delay is so much shorter.
So, in summary, for downlink congestion we saw that by properly sizing the router buffer we can get a very sizable and noticeable QoE improvement. And because we saw a mixed result on the network‑level metrics yet a net positive result on the QoE metrics, we want to draw attention to the fact that we should not only pay attention to the network‑level metrics but to the top‑level QoE metrics as well.
For upstream congestion we did a lab test and showed that AQM can also help; that is something where we as a community should think further about how we can encourage more AQM enablement in users' homes. All right. Thanks.
CHAIR: And we have questions.
AUDIENCE SPEAKER: Awesome, thank you for presenting the topic. I work at LinkedIn, and my only comment is that other tests that I have seen in other companies, basically testing small buffers versus big buffers, are very much in line with your conclusions: basically, big buffers are not enough to ‑‑ not worth investing time in, so small buffers usually result in better user experience.
TE‑YUAN HUANG: Thank you.
BRIAN TRAMMELL: Google, MAT chair, speaking with none of those hats on. Could you talk a bit about how you generate the QoE metrics? I know that's a big question. What is the scale of that number, where does that number come from, is it a subjective user score or generated some other way?
TE‑YUAN HUANG: Those are objective scores. We track whenever an interruption happens on the user's device and we send out ‑‑ not exactly at that moment but after the fact. And so, based on all these logs, we can calculate all the information, which includes rebuffers, media quality, etc.
BRIAN TRAMMELL: And you correlate those scores with the network metrics as well?
TE‑YUAN HUANG: Yes. Those are on the client side. We also have the same logging on the server side and we use a unique ID to match them.
BRIAN TRAMMELL: Thank you.
CHAIR: Any more questions? All right. Thank you, Te‑Yuan Huang.
KATARZYNA WASIELEWSKA: Hello everyone. I am Katarzyna Wasielewska, I am glad to be here. I am an assistant professor at the State University of Applied Sciences in Elbląg, Poland.
And I would like to show you the results of an experiment with an existing method for ‑‑ in an ISP environment, in a real environment.
Let's recall what available bandwidth means. It is the difference between the capacity of the system and the current bandwidth usage. In other words, we would like to know how much bandwidth we have for our traffic. On an end‑to‑end connection in the network, the available bandwidth is determined by the link or node with the minimum unused capacity. Available bandwidth is a very important metric for network performance estimation and it can be used for different tasks in computer networks.
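[Editorial illustration, not part of the talk: a minimal Python sketch of the definition just given. The link capacities and usage figures are invented for the example, and the function name is hypothetical.]

```python
def available_bandwidth(links):
    """End-to-end available bandwidth for a path.

    links: list of (capacity_mbps, used_mbps) tuples, one per link or node.
    The per-link value is capacity minus current usage; the path value is
    the minimum over all links, i.e. the tightest link.
    """
    per_link = [max(capacity - used, 0.0) for capacity, used in links]
    return min(per_link)


# Hypothetical three-hop path: the middle link has the least unused capacity.
path = [(100.0, 40.0), (10.0, 8.0), (1000.0, 300.0)]
print(available_bandwidth(path))
```

Note that the tightest link is not necessarily the one with the smallest capacity; a large but heavily used link can be the limit.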
There are a lot of different available bandwidth estimation methods and tools but none is perfect.
There are active and passive measurement methods and I would like to show you the results of applying an existing method based on passive measurement. In this method, available bandwidth is described by a service curve. In order to estimate available bandwidth by service curve we need to use the operator from the network calculus technique. This method needs to capture the traffic incoming to the system and the outgoing traffic. And we have to create arrival and departure curves for this method and, as you see, I show how the arrival and departure functions are defined: arrival is the sum of bits incoming to the system from ‑‑ and departure is similarly the sum of bits outgoing from the system.
And according to the publication, which I mention below, the service curve calculated by this method is the best possible estimate of the actual service curve, which describes available bandwidth, that can be justified from measurements of the arrival and departure functions.
And you can see the simulation result. We have here the arrival function which shows ‑‑ which shows us traffic with ‑‑ with a 4 megabits per second rate, and the departure curve, the outgoing traffic which leaves the system at a 2 megabits per second rate. And the estimate of available bandwidth, as you see on the graph, overlaps the departure function. Please notice we have no information about the capacity of the system. We don't know which interfaces were used, we don't know if maybe it was faster ‑‑ gig ‑‑ it is only extraction of information based on the traffic which is measured and captured on the incoming interface and the outgoing one.
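[Editorial illustration, not part of the talk: a small Python sketch of the simulated case described above. It assumes that the network calculus operator referred to is min‑plus deconvolution of the departure function by the arrival function over the measured window; the talk does not name the operator, and the function names and one‑second time slots are invented for the example.]

```python
def cumulative(bits_per_slot):
    """Turn per-slot traffic volumes into a cumulative, non-decreasing function."""
    total, curve = 0, [0]
    for bits in bits_per_slot:
        total += bits
        curve.append(total)
    return curve


def service_curve_estimate(arrival, departure):
    """Min-plus deconvolution of departure by arrival on a finite window.

    estimate[t] = max over u of (departure[t + u] - arrival[u]),
    restricted to indices that were actually measured.
    """
    n = len(departure)
    return [
        max(departure[t + u] - arrival[u] for u in range(n - t))
        for t in range(n)
    ]


# Ten one-second slots: 4 Mbit arrive and 2 Mbit depart in each slot.
arrival = cumulative([4] * 10)
departure = cumulative([2] * 10)
estimate = service_curve_estimate(arrival, departure)
print(estimate == departure)
```

Under these assumptions the estimate coincides with the departure function, which is consistent with the overlap described for the graph, and no knowledge of the interface capacity is used.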
But I wanted to check this method in a real network environment. You can see here part of the network topology and the interfaces where the traffic was captured. And I had a few routers and wireless and wired links, routers, estimate ‑‑ experience that cross traffic, it was a very stable wireless connection, and the methodology is as follows:
We have to capture Internet traffic on the selected interfaces, generate time series for arrival and departure functions. It means that we have cumulative and non‑decreasing functions always. And the last step is to calculate Values of Service Curve by VLFV method. In this example you can see the traffic of one customer in the network. In red I marked the total traffic which was captured during the five minutes, but please look at the part of this graph ‑‑ traffic graph from 100 seconds, from this place.
I must tell you that the customer traffic was simulated for download to 6.4 megabits per second but in this estimation, where I estimated available bandwidth for selected streams, marked by green colour on the picture, I ‑‑ it was a 1.7 megabits per second maximum rate, and you can see this rate also on the picture below because I show you the arrival and departure curves.
And the estimation of available bandwidth is on the picture, this is the service curve marked by red colour, and we can see that this is an estimation only, based on the captured traffic, and in the beginning when we have low speed, the estimation shows us that we could have faster traffic; it means that we have more room for the traffic in the system, but we ‑‑ the traffic doesn't use it.
And as you see, we have in this example five megabits per second difference but this method doesn't capture it.
I divided those ‑‑ that period to two parts and you can see the first 100 seconds on the left side and on the right side where you see the last 100 seconds of estimation and as you see, we have similar results on the right side estimation by service curve is close to the departure curve and on the left side we see convex arrival and departure functions and concave service curve and it means that we have estimation which can be described as possibility of bandwidth usage in the system because in the beginning the traffic could use more bandwidth than it does.
I made a lot of different experiments with differentiation and timescale and I would like to show you one example. We have the traffic graph for one customer marked by a blue colour and I estimated available bandwidth for two flows: flow A, which is quite a regular connection and rate, and flow B, where the traffic appears in the system from time to time, and you can see the additional statistics.
And below you see the estimation of available bandwidth for flow A and flow B. It is quite a regular function for flow A, but for B we see this estimation shows that sometimes we have no available bandwidth for this traffic. This is the drawback, I think, of this method. And I estimated ‑‑ I have done tests also for shorter periods, for each minute of the five minutes in the previous experiment, and also for each second, and you can see the results of those experiments.
And obviously we can capture the data shift during the estimation. And well, available bandwidth estimation can be a big challenge. Still we have questions, for example how much bandwidth we have between nodes in different ‑‑ located in different ISP networks or how much bandwidth we have between home user to selected node in the Internet. And I think that after presentation I should ask you about the opinions on this subject, what do you think about it, I think. Should I ask you. Thank you very much.
CHAIR: Any questions? No. Okay. We will have the next speaker.
JARI ARKKO: Together with my colleague Marcus and Brian Trammell we want to come and talk about measurements in the era of encryption. Internet traffic has become more encrypted and I guess the first thought that comes to mind is, maybe we are done with measurements, at least direct passive measurements, because everything is encrypted, we can't see anything, and the meeting is coming to a close, and do we need to meet again? Maybe we still need to discuss some things and do some measurements.
We wanted to talk about one particular case, the upcoming new transport protocol QUIC, which also has a measurement‑related feature, and talk about what can be done with that.
With that as a useful example for other cases as well.
Just to set the stage and make sure we are all on the same page, let's go very briefly over what QUIC actually is and what it provides. So obviously it provides better security, mostly through encrypting additional data, not just the payload like in TCP and TLS but also the control information. It's also quite deployable because it is typically implemented in user space and you can change the browser every night, whereas changing everybody's kernels in the network is going to take a while. Deployable over UDP.
From a protocol stack perspective, if you had HTTP running on top of TLS and TCP, you would have a thinner HTTP running on top of thicker transport layer functionality inside QUIC, and then that whole thing runs on top of UDP, and inside QUIC you would handle many different functions, multiplexing, transport layer security and so forth.
QUIC is also quite extensible as protocols go and it's supposed to avoid ossification, so the network, like in the case of TCP, has been looking at the packets quite a bit and it's difficult to change the protocol definitions or what end points do because so many other parties are involved in the transaction in a practical sense.
It also provides reduced latency, if you do instead of multiple surprisingly things go faster, the most extreme case you can actually send data on the first packet which is quite remarkable. Avoids head of the line blocking.
With that, it does present some problems for measurements, the primary problem being since you can't see the information that is being exchanged even the control information you don't really know what's going on so you can't see like RTT information because you don't know how one packet is related to another in the other direction, you can't see the packet numbers, or acknowledgements or sequence numbers as they were in TCP and you can't see any other things like what actions you are taking perhaps of retransmissions and congestions. Where the traffic goes. Or maybe not.
So enter the QUIC spin bit. The IETF has talked about this topic quite a bit and they sort of explicitly decided to provide three pieces of information for on‑path elements, simplifying a little bit, but this is what is in the data packets at least, and this is an explicit decision that we will provide these pieces of information, not that we will just provide some information and hope that nobody reads it. That didn't really work for TCP at least. Those three pieces are the IP header, which is needed because you still need to forward packets, for instance. And you need connection IDs, the session identifiers in QUIC, and they will be needed by the receivers and they might also be needed by things like load balancers, so those are ‑‑ and the spin bit, which is a mechanism for passive latency measurement. On the picture you see the spin bit, the S bit, which is the third bit in each packet.
And the reasons for doing this latency measurements could be multiple, could be just troubleshooting trying to understand what is going on, could be congestion detection for some purposes that you can compare what's happening with the trend of latency or RTT and decide that this is caused by congestion, perhaps. We could do quality monitoring in other ways, you could even imagine firing up some alarms or waking up people if things go ‑‑ parameters go out of the prescribed limits. Or you could do research. So that's all fine.
How does the spin bit work? The client keeps sending packets with zeros, they reach the server, and the server is just dumb in this case: it's copying to its next outgoing packet whatever value it has received last, and those packets go through the network and eventually reach the client, and when the first such packet comes to the client it flips the bit and starts sending ones, and you can figure it out: if there is an observer on the path somewhere, the observer can look at these bit values and determine what might be going on.
So you can do some measurements; at the highest level this sort of provides something similar to TCP acknowledgements and sequence numbers. It's not exactly the same thing. For instance, because you have results that are affected by loss and ‑‑ don't understand what these individual bits mean, could be in different order, but you can still filter that out, you can do that relatively easily.
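[Editorial illustration, not part of the talk: a minimal Python sketch of the mechanism just described. The endpoint rules follow the description above (the server reflects, the client inverts); the observer logic, the function names and the packet timestamps are assumptions made for the example, and it does no filtering for loss or reordering.]

```python
def server_next_spin(last_seen_from_client):
    """The server copies the most recent spin value it received."""
    return last_seen_from_client


def client_next_spin(last_seen_from_server):
    """The client sends the inverse of the most recent value it received."""
    return 1 - last_seen_from_server


def spin_rtt_samples(packets):
    """Estimate round-trip times from spin bit transitions.

    packets: iterable of (timestamp_ms, spin_bit) observed on-path in ONE
    direction. Each time the bit changes value an "edge" is recorded, and
    the time between two consecutive edges is taken as one RTT sample.
    """
    samples = []
    last_value = None
    last_edge = None
    for timestamp, spin in packets:
        if last_value is not None and spin != last_value:
            if last_edge is not None:
                samples.append(timestamp - last_edge)
            last_edge = timestamp
        last_value = spin
    return samples


# Hypothetical client-to-server packets seen by an observer (times in ms).
observed = [(0, 0), (10, 0), (20, 0), (40, 1), (50, 1), (80, 0), (90, 0), (120, 1)]
print(spin_rtt_samples(observed))
```

A real observer would also need to discard samples distorted by lost or reordered packets.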
And you end up with something that is relatively close to what the end points would experience with their perfect information as to the actual RTT. So here is a picture of, on the lower side of it is a measurement by the client and a server and then in this particular example we introduced some artificial packet loss and doing some measurements with spin bits on the upper picture the spin bit derived latency values as shown and we could go into details of this but even from this picture you will be able to see there is some correlation between the lower and upper parts. So you know while you don't get exactly the right values perhaps because of some of these complications you do get some kind of picture at least.
And implementations: obviously no research is in the end very useful if it doesn't get turned into some real world practice and equipment and implementations. There are many, many QUIC implementations and many of them do the spin bit part, which is optional for privacy reasons, and even those that do ‑‑ always will leave some fraction of traffic not marked, just to sort of hide possibly nodes that want to do this on purpose, not to reveal this information.
There is also an in‑network tool that analyses the information, called Spindump, which we started at Ericsson with Marcus. It's a tool that supports measurement for QUIC with spin bit and without, but you can also look at a bunch of other protocols, ICMP, DNS, CoAP, and provide measurements either in aggregate or for individual connections, and send these to the user's screen or some central collection point.
So, you know, a piece of software maybe for you to try out and take for a spin. And it's Open Source so it's easy to contribute to and easy to try. This is basically where we are at: we have this technology and some software that does this thing on different network nodes, and we are sort of at a stage where we could do even more interesting measurements perhaps than we were able to do in the past.
So with that, I will leave it up for questions or other comments that people might have.
BRIAN TRAMMELL: Stay up there. I can't let you down that quick. Brian Trammell, no hats. Can you talk a bit about the implementation status of the spin bit. You were in London, I think, at the Interop before this?
JARI ARKKO: It was inconveniently scheduled on top of the RIPE meeting, so I spent the first part of the week there and the second part here. I wish I had remembered the exact numbers but there are maybe 15 different QUIC implementations altogether, of different types, like major manufacturers, browser vendors and smaller research implementations and so forth. And maybe a handful of them, like five, have the spin bit and a few more have promised to have support for that, and the set with spin bit does include Microsoft and Apple and so on. So, I am optimistic that we have at least enough to make measurements, even if not everybody will do this and, as noted, for privacy reasons, this isn't like very sensitive information, you could derive it also in other ways, but it's good to be able to not set these things and therefore implementations will not always do this anyway.
BRIAN TRAMMELL: For the people in the room who haven't followed this, I gave a talk about this at a lightning talk two years ago, who is doing interesting stuff with RTT measurements, and there was a discussion about how safe it is to expose roundtrip times, so the outcome is we have a signal: if you look at the ‑‑ at how the spin bit goes around, if one side decides not to participate then it ensures no RTT measurement can be made and that it's easy to determine that none can be made. It's a thing where both sides have to cooperate in order to do that, and that was the outcome of that two‑year‑long discussion, and the IETF was that this is the way to make measuring RTT on the Internet safe and I ‑‑
JARI ARKKO: Two years per one bit.
BRIAN TRAMMELL: Exactly. Which honestly for the IETF that's not bad. Like I said, no hats. Thank you.
JARI ARKKO: Thank you.
BRIAN TRAMMELL: So Chris.
CHRISTOPHER AMIN: Good afternoon.
BRIAN TRAMMELL: We are okay on time so you don't have to hurry up as much as we told you you did.
CHRISTOPHER AMIN: I am Chris, I am going to present quick updates from, on behalf of my colleagues in the NCC in the field of measurement analysis and tools.
First, RIPE Atlas. So, VM anchors, they exist, we have them. And there are around 100 of them. You can apply to host your own VM anchor. The link is there. We have Amazon Cloud anchors in cooperation with Amazon and we are in discussion with Google as well so we should have presence there.
So the measurement UI on RIPE Atlas has been revamped. One of the biggest changes is we ‑‑ the measurement UI is fully backed by the measurement API, which is a good way for us to use our own ‑‑ for the APIs. Behind the scenes this is allowing us to do some scalability improvements with the API which to start with probably end users won't notice, but it will mean that we will be able to speed up the APIs and add in new features shortly.
So software probes. They do not exist but they will exist, rather soon. So the key thing here is, we want to collaborate with various people to help us better test software probes over the summer. So, we are working right now on new infrastructure, so the actual API, and long‑term data storage will be exactly the same between hardware probes and software probes. But the machines that the software probes talk to will be different, they will be dedicated.
So, primarily what we want is people to say hey, I am good at packaging for this operating system, this distribution, I can put together Docker containers, can package virtual machine images and pretty much anything. Because we are not going to package the software for every operating system. There is a survey, so if you have any expertise in this area, please fill in the survey. If you are interested in deploying software probes when they are available in your operating system of choice, please also fill in the survey because that can help us understand how this is actually going to be deployed.
So, a few more smaller things which I guess if I have time we can go through. So EDNS cookie support, which kind of came out of the DNS flag day: Atlas did not support it, now it does. On the probe stock, we currently do not have any RIPE Atlas probes available. So the only stock available now is with RIPE Atlas ambassadors, so if you know an ambassador that's the way to get probes. We will be having new stock; we can't guarantee a date at the moment but sometime in Q4.
So I wrote a bullet‑point there but basically the software probes is one reason why we are conducting a security review, because there will be many more probes from many more sources talking to our infrastructure. We have had independent security reviews conducted in the past, and we are currently concluding another one just to make sure that we are as secure as we can possibly be.
We are working on RIPE Atlas data being available in Google BigQuery, which I believe it is there in some form for research purposes. I think the idea is in the future we can have more collaboration in this and people can have access to the Atlas data in a big data environment. And we are looking at CDN support to make the RIPE Atlas interface, the website, work better if you don't happen to be located near Amsterdam.
So RIR cooperation: there is a new tool, netox.apnic.net, so this was developed in collaboration between the NCC and APNIC. It is powered by RIPE Stat so it has the same functionality as RIPE Stat or a subset of the functionality. And you can expect more forms of collaboration like this in the future with other RIRs.
So there is a proof of concept for a new RIPE Stat UI, so it's not the official UI at the moment but you can already check it out, you can already use it. It's very nice. It works on, works very well on mobile devices, the link is there and it explains the new system.
So a couple new features. One is improved RPKI support so within the widgets we display the RPKI validation status with a little happy face hopefully. And RPKI history support so you can see uptake of RPKI over time on a nice graph.
Operationally, as always with RIPE Stat it's up and to the right, so we are most recently seeing around 70 million requests per day to the RIPE Atlas data API. This has been made possible due to improvements in scalability, so there were some scalability issues; they have been solved to allow us to reach these levels and keep on growing in the future.
And there is the RIPE Stat feature tracker, upload, we have had positive feedback from that and on that, and so you can go there, you can request features and they can be picked up by the RIPE Stat team to be worked on.
So a couple of other projects. So RIS Live, this is a live JSON‑based, WebSocket‑based stream from the RIPE NCC RIS route collectors, so you can listen to data coming out of the route collectors in near realtime; these are BGP update messages and you can filter by autonomous system and by prefix. Go and play with it and see what it's like. There is a survey for this as well and we are really looking for feedback from people.
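[Editorial illustration, not part of the talk: a hedged Python sketch of what a filtered subscription to such a stream might look like. It only builds the JSON message and does not open a connection. The endpoint URL, the message type and the field names are recalled from the public RIS Live documentation and should be checked against it before use; the ASN and prefix are example values and the function name is hypothetical.]

```python
import json

# Assumed WebSocket endpoint; verify against the RIS Live documentation.
RIS_LIVE_URL = "wss://ris-live.ripe.net/v1/ws/"


def ris_subscribe_message(asn=None, prefix=None):
    """Build a subscription message that filters BGP UPDATE messages.

    asn:    match updates whose AS path contains this autonomous system
    prefix: match updates for this prefix
    """
    data = {"type": "UPDATE"}
    if asn is not None:
        data["path"] = str(asn)
    if prefix is not None:
        data["prefix"] = prefix
    return json.dumps({"type": "ris_subscribe", "data": data})


# Example values only: filter on one ASN and one prefix.
message = ris_subscribe_message(asn=3333, prefix="193.0.0.0/21")
print(message)
```

A client would send this string over the WebSocket after connecting and then read JSON messages from the stream.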
We will have a kind of a decision at the end of Q3 on if we keep this as a service for the NCC permanently and in what form.
So the NCC collaborates with researchers and external research in general in various ways. We have a lot of data, we provide that data in general publicly but we will also work one‑on‑one with people and with teams to improve the insider knowledge of that data. This particular case is a case of an intern coming to work with us in the NCC, and in this case it was a state‑based trend analysis layer built on top of the RIPE Atlas data so it adds extra use for the Atlas data.
That's pretty much it. If anybody has any questions you can ask me now and you can also ask anyone wearing one of these black straps especially those in this room.
JEN LINKOVA: RIPE Atlas user. We discussed today, some people said when I measure using RIPE Atlas I am measuring residential networks which geeks are using, right. So I have a totally crazy idea regard ‑‑ can we get them on the phones so we can measure mobile networks?
CHRISTOPHER AMIN: In theory I don't see why not. But that's again an area where we would like to work with people who have knowledge of packaging, this Linux software on Android, say, if someone can come up and say.
AUDIENCE SPEAKER: If you are using RIPE Atlas because if you use anchors you are not measuring residential networks. Most are in data centres.
MASSIMO CANDELA: So the RIS Live stream is amazing and I know this is a prototype but we already use it internally, and I know other parties where they use it, and it is going to be useful for research. So, please keep this as a service because it's amazing to have it with realtimeness, and also I would like to push everybody to actually push RIPE on keeping this as a service and do the form and whatever is needed.
CHRISTOPHER AMIN: Thank you.
RANDY BUSH: Arrcus and IIJ. I always support Massimo. We have used the stream in two projects but, Jen, go back to hackathon, I don't know which, there are a collection and we actually did mobile Atlas probe by a disgusting hack but you never asked. So, and I don't remember when, but Vesna will ‑‑ see Vesna.
CHRISTOPHER AMIN: It was equivalent of making a camera phone by sticking camera and phone with cellotape.
RANDY BUSH: It involved Raspberry PI, it was fun, it was a hackathon.
JEN LINKOVA: I don't want to measure gear, lets measure normal people.
RANDY BUSH: I think they average about five‑foot seven.
DANIEL KARRENBERG: RIPE NCC. One remark: Where us geeks really think this measuring the phones thing is a problem of getting the thing on the phone I would like to remind people actually the scaling up of the infrastructure that actually takes in the measurements is the much harder problem and if you guys want that to happen I think you have, you are at the same time RIPE NCC member, you will have to go and say please do this because it's going to cost money. And whether we do it sort of in the Cloud based back end infrastructure or whether we, if we do it the way we are doing it right now it's going to cost money and it's going to cost money ‑‑ I am babbling. The question is about the software probes. You said there is a Beta test, is part of this actually also aiming at co‑locating software probes with existing hardware probes and see and analyse the differences and the results?
CHRISTOPHER AMIN: Yes.
DANIEL KARRENBERG: Because I would really like to see that.
CHRISTOPHER AMIN: That would be part of it.
DANIEL KARRENBERG: Let me add a call for people who have hardware probes and are willing to install software probes to come forward.
CHRISTOPHER AMIN: Yeah, exactly, if people are going to volunteer it's even better if they have hardware probes or we can make sure they do.
BRIAN TRAMMELL: Speaking in my capacity as the owner of probe 14 mumble, I forget the other numbers. Talk to me, I would like to help with the co‑location of the hardware and software probe thing. So that raised another question: So, you are talking about packaging and you are kind of waving your hands about what you mean, can I infer from that that when you are talking about software probes it's not a hey we are going to have a hosted VM or container it could even be, you know, you have a machine and it's running other stuff on it, you could also install this as a Daemon that is running in the ‑‑
CHRISTOPHER AMIN: That is the sort of principle. But I think the test would be useful in this regard, because if it turns out that we might at least make recommendations like it would be better this software runs on uncontested resources and so on but we don't really know at the moment but it should be something like install RIPE Atlas and goes from there.
BRIAN TRAMMELL: I would be happy to help testing with this, even Docker versus VVM and VM ‑‑ whatever the thing is called, I forget, and R versus like machine ‑‑ I would be happy to help.
RANDY BUSH: IIJ and Arrcus again. I think possibly, trying to deal with Daniel's scaling problem and Jen's desire for phones, to try to compromise, if we did v6 only and it required DHCPv6.
CHRISTOPHER AMIN: Okay. Thank you.
BRIAN TRAMMELL: So, thank you all very much, especially thanks to the speakers, many of whom came to us with 20 to 30‑minute‑long presentations and we said please cut them to 12, which is sort of a brutal amount of compression. Thank you very much for helping keep us to time today. We have a couple of minutes if anyone has any other business or open mic for the MAT Working Group. No. Okay. Thank you very much, enjoy your evening and we will see you in Rotterdam.
LIVE CAPTIONING BY AOIFE DOWNES, RPR