Good Idea: Slash Government-backed Student Loans

Why are college prices going through the roof? They could be rising due to increased costs, or they could be rising because that is simply what the market will bear.

Mark Cuban proposed this recently, and I think he’s right. The government keeps throwing more money at students. Awash with cash, those students then go to school and spend it. With plenty of eligible students, the schools simply raise rates, the government loans more, and the cycle repeats. Shall we stop the madness? Stop giving out massive government student loans and this problem could evaporate entirely.

Here’s a fun infographic about college loans.

And here’s Mark:

3. Limit the Size of Student Loans to $2,000 per year

Crazy? Maybe, maybe not. What happened to the price of homes when the mortgage loan bubble popped? They plummeted. If the size of student loans are capped at a low level, you know what will happen to the price of going to a college or university? It will plummet. Colleges and universities will have to completely rethink what they are, what purpose they serve and who their customers will be. Will some go out of business? Absolutely. That is real world. Will the quality of education suffer? Given that TAs will still work for cheap, I doubt it.

Now some might argue that limiting student loans will limit the ability of lower income students to go to better schools. I say nonsense on two fronts. The only thing that allowing students to graduate with 50k, 80k or even more debt does is assure they will stay low income for a long, long time after they graduate! The 2nd improvement will be that smart students will find the schools that adapt to the new rules and offer the best education they can afford. Just as they do now, but without loading up on debt.

The beauty of capitalism is that people like me will figure out new and better ways to create and operate for profit universities that educate as well or better as today’s state institutions, AND I have no doubt that the state colleges and universities will figure out how to adapt to the new world of limited student loans as well.

Finally, the impact on the overall economy will be ENORMOUS. There is more student loan debt than credit card debt outstanding today. By relieving this burden at graduation, students will be able to participate in the economy.

Macs Are So Easy To Use

I’m still struggling with my Mac, but I am getting better.  Nonetheless, this week’s UI episode was so comical I decided to post it.

The Task:  By default, the screen saver locks me out of my Mac every 5 minutes!  Make it not time out for 15-30 minutes, and don’t force an immediate password prompt.  This shouldn’t be too hard, right?

Step #1:  Go to System Preferences, click on “Desktop & Screen Saver”, and move the slider from 5 minutes to 15 minutes.  Great!  Hey, they even let me choose from a bunch of cute screen saver options!  Nice.

Disappointment #1:  It didn’t work!  My machine still locks me out after 5 minutes.

At this point, it took me days to realize that it was probably the Mac equivalent of power-saving mode causing the problem.  If I weren’t a techie, I never would have thought of this.  But hey, no problem, I can fix that…

Step #2: Go to System Preferences again and find the “Energy Saver” pane.  Great!  Sure enough, the defaults are too low.  Extend them out.  Nice.  Now I must be all done.

Disappointment #2:  OK – it partially worked, but shoot, it’s still prompting me for my password all the time.  How do I fix that?

I spent a lot of time combing through both the Screen Saver and the Energy Saver panels, but I couldn’t find an option to disable the password prompt.  Searching through other panels didn’t turn up anything either.

Finally, Google solved the problem.

Step #3: Go to the System Preferences “Security” pane, which has the setting “Require password [immediately] after sleep or screen saver begins.”  Change that value.

Alas, 3 days, 3 control panels, and 1 Google search later, I no longer have my Mac locking me out all the time.  Score one up for Apple’s mastery of User Interface design!  Hope this helps someone else someday too.
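
For the command-line inclined, the same three settings can also be scripted.  The sketch below shells out to the defaults and pmset tools; the preference keys (com.apple.screensaver idleTime and askForPassword) have changed across OS X versions, so treat the exact names as assumptions to verify on your release.

    import subprocess

    def run(cmd):
        """Print and execute one settings command."""
        print("$ " + " ".join(cmd))
        subprocess.check_call(cmd)

    # Step 1: screen saver idle time, in seconds (here, 15 minutes).
    run(["defaults", "-currentHost", "write", "com.apple.screensaver", "idleTime", "-int", "900"])
    # Step 2: display sleep, in minutes (needs admin rights).
    run(["sudo", "pmset", "-a", "displaysleep", "30"])
    # Step 3: don't demand a password when the screen saver kicks in.
    run(["defaults", "write", "com.apple.screensaver", "askForPassword", "-int", "0"])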


IPv6 DNS Lookup Times

A couple of weeks ago, I posted an article about slow IPv6 DNS query performance.  Several readers suggested in the comments that my observations were isolated to a few bad implementations and that perhaps Mac and Linux systems were not prone to this.  I hoped they were right, but I now have data to show they’re wrong.

Measuring performance is routine in Chrome, so a couple of days ago I was able to add a simple test to measure the speed of pure IPv4 lookups (A record) vs IPv6 lookups (A + AAAA records).  Today, with hundreds of millions of measurements in hand, we know the impact of IPv6 on DNS.

[Chart: DNS lookup latency increase after enabling IPv6, Windows vs. Mac]

Windows users face a ~43% DNS latency increase, and Mac users face a ~146% DNS latency increase, when a client-side IPv6 address is configured on their machines.

Today, there are two basic approaches to the IPv6 lookup problem: you can issue the requests in serial or in parallel.  Obviously, issuing them in serial hurts latency more than issuing them in parallel.  But even issuing them in parallel is slower than a single lookup, because the expected maximum of two samples drawn from the same latency distribution is higher than the expected value of a single sample.
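
To make the parallel case concrete, here is a tiny simulation.  It is only a sketch: the 40ms mean and 10ms spread are made-up parameters, not measured values.

    import random

    MEAN_MS, STDDEV_MS = 40.0, 10.0   # assumed lookup latency distribution
    TRIALS = 100000

    single_total = 0.0
    parallel_total = 0.0
    for _ in range(TRIALS):
        a = random.gauss(MEAN_MS, STDDEV_MS)     # the A lookup
        aaaa = random.gauss(MEAN_MS, STDDEV_MS)  # the AAAA lookup
        single_total += a                        # IPv4-only client waits for A only
        parallel_total += max(a, aaaa)           # dual-stack client waits for the slower of the two

    print("avg single lookup:  %.1f ms" % (single_total / TRIALS))
    print("avg parallel pair:  %.1f ms" % (parallel_total / TRIALS))

With these parameters the parallel pair averages about 5-6ms worse than a single lookup; the serial case, of course, pays for the entire second lookup.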

Some readers may argue that these results are dominated by older implementations.  Further investigation into the data does not confirm this.  Even the fastest 10% of Mac or Windows users (who are presumably using the best DNS resolver implementations available) see greater than 100% DNS lookup latency increases with IPv6.  Of course, there is some conflation here, as DNS server response times may be slower in addition to the client’s double lookup being slower.  A deeper study of why these results are so bad is warranted.

Summary

Performance-sensitive users should be cautious about assigning a global IPv6 address on their clients.  As soon as they do, the browser will switch into IPv6 lookup mode and take on a 43-146% increase in DNS latency.

Firefox Idle Connection Reuse

HttpWatch did some anecdotal testing and concluded that Firefox’s new algorithm for selecting which idle connection to reuse has some strong benefits.

This is great stuff, and in general it should definitely help.  This is part of why getting to one-connection-per-domain is an important goal.  HTTP’s use of 6 or more connections per domain makes it so that each connection must “warm up” independently.  A similar algorithm should land in Chrome soon too.
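
As a rough sketch of the idea (my own illustration, not Firefox’s or Chrome’s actual code), the policy amounts to preferring the “warmest” idle connection, for example the one with the largest congestion window, rather than simply the first one found:

    from dataclasses import dataclass

    @dataclass
    class IdleConnection:
        host: str
        cwnd_bytes: int       # estimated congestion window: how "warmed up" it is
        idle_seconds: float   # how long it has sat unused

    def pick_connection(idle_pool):
        """Prefer the largest congestion window, breaking ties by most recently used."""
        if not idle_pool:
            return None
        return max(idle_pool, key=lambda c: (c.cwnd_bytes, -c.idle_seconds))

    pool = [
        IdleConnection("example.com", cwnd_bytes=14600, idle_seconds=12.0),  # cold
        IdleConnection("example.com", cwnd_bytes=64000, idle_seconds=1.5),   # warm
    ]
    print(pick_connection(pool))  # selects the warm, 64000-byte-cwnd connection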

Fortunately, there is a protocol for this stuff too 🙂  Hopefully Firefox will pick it up soon.

SSL: It’s a Matter of Life and Death

Over a year ago, when we first announced SPDY, the most prominent criticism was the requirement for SSL. We’ve become so accustomed to our insecure HTTP protocol that making the leap to safety now seems daunting.

Since that time, many things have happened, and it is now more clear than ever that SSL isn’t an option – it’s a matter of life and death.

SSL was invented primarily to protect our online banking and online purchasing needs.  It has served us fairly well, and almost all banks and ecommerce sites use SSL today.  What nobody ever expected was that SSL would eventually become the underpinning of safety for political dissidents.

Last year, when China was caught hacking into Google, were they trying to steal money?  Two months ago, when Comodo was attacked (and Comodo suspected the Iranian government), did the attackers forge the identities of Bank of America, Wells Fargo, or Goldman Sachs?  No.  They went after Twitter, Gmail, and Facebook – social networking sites.  Sites where you’d find information about dissidents, not cash.  To say that these attacks were used to seek and destroy dissidents would be speculation at this point.  But these incidents show that the potential is there and that governmental intelligence agencies are using these approaches.  And of course, it is a well-known fact that the Egyptian government turned off the Internet entirely so that its citizens could not easily organize.

The Internet is now a key communication mechanism for all of us.  Unfortunately, users can’t differentiate safe from unsafe on the web.  They rely on computer professionals like us to make it safe.  When we tell them that the entire Web is built upon an unsecured protocol, most are aghast.  How could we let this happen?

As we look forward, this trend will increase.  What will Egypt, Libya, Iran, China, or Afghanistan do to seek out and kill those that oppose them?  What does the US government do? 

Fortunately, major social networking sites like Facebook and Twitter have already figured this out.  They are now providing SSL-only versions of their services which should help quite a bit.

So does all this sound a little dramatic?  Maybe so, and I apologize if this sounds a bit paranoid.  I’m not a crypto-head, I swear.  But these incidents are real, and the potential is real so long as our Internet remains insecure.  The only answer is to secure *everything* we do on the net.  Even the seemingly benign communications must be encrypted, because users don’t know the difference – and for some of them, their lives are at stake.

SSL FalseStart Performance Results

From the Chromium Blog:

Last year, Google’s Adam Langley, Nagendra Modadugu, and Bodo Moeller proposed SSL False Start, a client-side only change to reduce one round-trip from the SSL handshake.

We implemented SSL False Start in Chrome 9, and the results are stunning, yielding a significant decrease in overall SSL connection setup times. SSL False Start reduces the latency of an SSL handshake by 30% [1]. That is a big number. And reducing the cost of an SSL handshake is critical as more and more content providers move to SSL.

Our biggest concern with implementing SSL False Start was backward compatibility. Although nothing in the SSL specification (also known as TLS) explicitly prohibits FalseStart, there was no easy way to know whether it would work with all sites. Speed is great, but if it breaks user experience for even a small fraction of users, the optimization is non-deployable.

To answer this question, we compiled a list of all known https websites from the Google index, and tested SSL FalseStart with all of them. The result of that test was encouraging: 94.6% succeeded, 5% timed out, and 0.4% failed. The sites that timed out were verified to be sites that are no longer running, so we could ignore them.

To investigate the failing sites, we implemented a more robust check to understand how the failures occurred. We disregarded those sites that failed due to certificate failures or problems unrelated to FalseStart. Finally, we discovered that the sites which didn’t support FalseStart were using only a handful of SSL vendors. We reported the problem to the vendors, and most have fixed it already, while the others have fixes in progress. The result is that today, we have a manageable, small list of domains where SSL FalseStart doesn’t work, and we’ve added them to a list within Chrome where we simply won’t use FalseStart. This list is public and posted in the chromium source code. We are actively working to shrink the list and ultimately remove it.

All of this represents a tremendous amount of work with a material gain for Chrome SSL users. We hope that the data will be confirmed by other browser vendors and adopted more widely.


[1] Measured as the time between the initial TCP SYN packet and the end of the TLS handshake.
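
As a back-of-the-envelope check of that 30% figure (my arithmetic, not from the Chromium post): measured from the TCP SYN to the end of the TLS handshake, a fresh connection costs roughly one round trip for TCP plus two for the full TLS handshake, and False Start lets the client send its application data after just one TLS round trip.

    def setup_time_ms(rtt_ms, false_start):
        tcp_rtts = 1                         # SYN / SYN-ACK
        tls_rtts = 1 if false_start else 2   # False Start skips waiting for the server's Finished
        return (tcp_rtts + tls_rtts) * rtt_ms

    rtt = 100
    print("full handshake:   %d ms" % setup_time_ms(rtt, false_start=False))  # 300 ms
    print("with False Start: %d ms" % setup_time_ms(rtt, false_start=True))   # 200 ms
    # 300ms -> 200ms is a 33% reduction, in line with the ~30% reported above.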

IPv6 Will Slow You Down (DNS)

When you turn on IPv6 in your operating system, the web is going to get slower for you.  There are several reasons for this, but today I’m talking about DNS.  Every DNS lookup is 2-3x slower with IPv6.

What is the Problem?
The problem is that current DNS resolver implementations do the IPv4 and the IPv6 lookup in serial rather than in parallel.  This is operating as per the specification.

We can see this on Windows:

     TIME   EVENT
       0    DNS Request A www.amazon.com
      39    DNS Response www.amazon.com
      39    DNS Request AAAA www.amazon.com
      79    DNS Response www.amazon.com
      <the browser cannot continue until here>

The “A” request there was the IPv4 lookup, and it took 39ms.  The “AAAA” request is the IPv6 lookup, and it took 40ms.   So, prior to turning IPv6 on, your DNS resolution finished in 39ms.  Thanks to your IPv6 address, it will now take 79ms, even if the server does not support IPv6!  Amazon does not advertise an IPv6 result, so this is purely wasted time.
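
You can get a rough feel for this yourself from Python.  This is only a sketch: getaddrinfo goes through the OS resolver and its cache, so the numbers bounce around, and on an IPv4-only machine the resolver may skip the AAAA query entirely.

    import socket
    import time

    def timed_lookup(host, family):
        """Return how long one getaddrinfo call takes, in milliseconds."""
        start = time.time()
        socket.getaddrinfo(host, 80, family, socket.SOCK_STREAM)
        return (time.time() - start) * 1000.0

    host = "www.amazon.com"
    print("A only (AF_INET):     %.0f ms" % timed_lookup(host, socket.AF_INET))
    print("A + AAAA (AF_UNSPEC): %.0f ms" % timed_lookup(host, socket.AF_UNSPEC))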

Now you might think that 40ms doesn’t seem too bad, right?  But remember that this happens for every host you lookup.  And of course, Amazon’s webpage uses many sub-domain hosts.  In the web page above, I saw more of these shenanigans, like this:

     TIME   EVENT
       0    DNS Request A g-ecx.images-amazon.com
      43    DNS Response g-ecx.images-amazon.com
      43    DNS Request AAAA g-ecx.images-amazon.com
     287    DNS Response g-ecx.images-amazon.com

Ouch – that extra request cost us 244ms.

But there’s more.  In this trace we also had a lookup for the OCSP server (on Amazon’s behalf, for SSL):

     TIME   EVENT
       0    DNS Request A ocsp.verisign.com
     116    DNS Response ocsp.verisign.com
     116    DNS Request AAAA ocsp.verisign.com
     203    DNS Response ocsp.verisign.com

Ouch – another 87ms down the drain.

The average website spans 8 domains.  A few milliseconds here, and a few milliseconds there, and pretty soon we’re talking about seconds.

The point is that DNS performance is key to web performance!  And in these 3 examples, we’ve slowed down DNS by 102%, 567%, and 75% respectively.  I’m not picking out isolated cases.  Try it yourself, this is “normal” with IPv6.

What About Linux?
Basically all of the operating systems do the same thing.  The common API for doing these lookups is getaddrinfo(), and it is used by all major browsers.  It does both the IPv4 and IPv6 lookups, sorts the results, and returns them to the application.
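
The parallel approach is easy to sketch on top of getaddrinfo by issuing the two per-family queries on separate threads.  This is an illustration only; a real resolver also has to deal with sorting rules, AI_ADDRCONFIG, and plenty of error cases.

    import socket
    from concurrent.futures import ThreadPoolExecutor

    def resolve_parallel(host, port=80):
        """Issue the A and AAAA lookups concurrently instead of back to back."""
        with ThreadPoolExecutor(max_workers=2) as pool:
            v4 = pool.submit(socket.getaddrinfo, host, port, socket.AF_INET, socket.SOCK_STREAM)
            v6 = pool.submit(socket.getaddrinfo, host, port, socket.AF_INET6, socket.SOCK_STREAM)
            results = []
            for future in (v6, v4):          # put IPv6 results first, mirroring getaddrinfo's usual ordering
                try:
                    results.extend(future.result())
                except socket.gaierror:
                    pass                     # no records for this address family
            return results

    print(resolve_parallel("www.google.com"))

The total wait becomes roughly the slower of the two lookups instead of their sum.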

So on Linux, the behavior ends up being like this:

     TIME   EVENT
       0    DNS Request AAAA www.calpoly.edu
      75    DNS Response www.calpoly.edu
      75    DNS Request A www.calpoly.edu
      93    DNS Response www.calpoly.edu

In this particular case, we only wasted 75ms, when the actual request would have completed in 18ms (416% slower).

But It’s Even Worse
I wish I could say that DNS lookups were merely twice as slow.  But it’s actually worse than that.  Because IPv6 is not commonly used, the results of IPv6 lookups are not heavily cached at DNS servers the way IPv4 addresses are.  This means an IPv6 lookup is more likely to need to traverse multiple DNS hops to complete a resolution.

As a result, it’s not just that we’re doing two lookups instead of one.  It’s that we’re doing two lookups and the second lookup is fundamentally slower than the first.

Surely Someone Noticed This Before?
This has been noticed before.  Unfortunately, with nobody using IPv6, the current slowness was an acceptable risk.  Application vendors (namely browser vendors) have said, “this isn’t our problem, host resolution is the OS’s job”.

The net result is that everyone knows about this flaw.  But nobody fixed it.  (Thank goodness for DNS Prefetching!)

Just last year, the “Happy Eyeballs” proposal was introduced, which works around this problem by racing connections against each other.  This is an obvious idea, of course.  I don’t know of anyone implementing it yet, but it is certainly something we’re talking about on the Chrome team.
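
The racing idea itself is simple to sketch.  This is a toy version; the real Happy Eyeballs proposal gives IPv6 a small head start, caches which family won, and closes the sockets that lose the race.

    import socket
    from concurrent.futures import ThreadPoolExecutor, as_completed

    def connect_race(host, port=80, timeout=5.0):
        """Try every resolved address (IPv6 and IPv4) in parallel; first success wins."""
        candidates = socket.getaddrinfo(host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
        pool = ThreadPoolExecutor(max_workers=len(candidates))
        futures = [
            pool.submit(socket.create_connection, (sockaddr[0], port), timeout)
            for _family, _type, _proto, _name, sockaddr in candidates
        ]
        winner = None
        for future in as_completed(futures):
            try:
                winner = future.result()   # the first connection to complete
                break
            except OSError:
                continue                   # this address failed; keep waiting on the others
        pool.shutdown(wait=False)          # don't block on the losers
        if winner is None:
            raise OSError("all connection attempts to %s failed" % host)
        return winner

    sock = connect_race("www.google.com")
    print("connected to", sock.getpeername())
    sock.close()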

What is The Operating System’s Job?
All browsers, be it Chrome, Firefox, or IE, use the operating system to do DNS lookups.  Observers often ask, “why doesn’t Chrome (or Firefox, or IE) have its own asynchronous DNS resolver?”  The problem is that every operating system, from Windows to Linux to Mac, has multiple name-resolution techniques, and resolving hostnames in the browser requires using them all, based on the user’s operating system configuration.  Examples of non-DNS resolvers include NetBIOS/WINS, /etc/hosts files, and Yellow Pages.  If the browser simply bypassed all of these and exclusively used DNS, some users would be completely broken.

If these DNS problems had been fixed at the OS layer, I wouldn’t be writing this blog post right now.  But I don’t really blame Windows or Linux – nobody was turning this stuff on.  Why should they polish a part of their product that nobody uses?

Lesson Learned:  Only The Apps Can ‘Pull’ Protocol Changes
IPv6 deployment has been going on for over 10 years now, and there is no end in sight.  The current plan (like IPv6’s “break the Internet” day) is the same plan we’ve been following for 10 years.  When do we admit that the current plan to deploy IPv6 is simply never going to work?

A lesson learned from SPDY is that only the applications can drive protocol changes.  The OSes, bless their hearts, can only do so much, and they move too slowly to push new protocols.  There is an inevitable chicken-and-egg problem where applications won’t use a new protocol because OS support is not there, and OSes won’t optimize it because applications aren’t using it.

The only solution is at the Application Layer – the browser.  But that may be the best news of all, because it means that we can fix this!  More to come…

IPv6 latency analysis coming….

Over the next few days, I’m going to be posting some blogs about IPv6 performance.

The results are pretty grim – but my aim is not to make everyone despair.

There is a solution, and I think I can see light at the end of the tunnel.  My theory is that we’ve been approaching IPv6 deployment incorrectly for the last 10 years.  It seems obvious now, but it wasn’t obvious 10 years ago, and things have certainly changed in ways that enable a new approach.

IPv6 break-the-world-day is approaching quickly.  If you haven’t started thinking about this, you should.

How to Get a Small Cert Chain

After my last article illustrated the length of our Certificate Chains, many people asked me “ok – well how do I get a small one?”.

The obvious answer is to get your certificate signed as close to the root of a well-rooted Certificate Authority (CA) as possible.  But that isn’t very helpful.  To answer the question, let’s look at a few of the problems and tradeoffs.

Problem #1: Most CAs Won’t Sign At The Root

Most CAs won’t sign from the root.  Root CAs are key to our overall trust on the web, so simply having them online is a security risk.  If the roots are hacked, it can send a shockwave through our circle of trust.  As such, most CAs keep their root servers offline most of the time, and only bring them online occasionally (every few months) to sign for a subordinate CA in the chain.  The real signing is most often done from the subordinate.

While this is already considered a ‘best practice’ for CAs, Microsoft’s Windows Root CA Program Requirements were just updated last month to require that leaf certificates are not signed directly at the root.  From section F-2:

All root certificates distributed by the Program must be maintained in an offline state – that is, root certificates may not issue end-entity certificates of any kind, except as explicitly approved from Microsoft.

Unfortunately for latency, this is probably the right thing to do.  So expecting a leaf certificate directly from the root is unreasonable.  The best we can hope for is one level down.

Problem #2: “Works” is more important than “Fast”

Having your site be accessible to all of your customers is usually more important than being optimally fast.  If you use a CA not trusted by 1% of your customers, are you willing to lose those customers because they can’t reach your site?  Probably not.

To solve this, we wish we could serve multiple certificates, always presenting a certificate that we know the specific client will trust.  (For example, if an old Motorola phone from 2005 needs a different CA, we could use a different certificate just for that client.)  But alas, SSL does not expose a user-agent as part of the handshake, so the server can’t do this.  Then again, hiding the user agent is important from a privacy and security point of view.

Because we want to reach all of our clients, and because we don’t know which client is connecting to us, we simply have to use a certificate chain which we know all clients will trust.  And that leads us to either presenting a very long certificate chain, or only purchasing certificates from the oldest CAs.

I am sad that our SSL protocol gives the incumbent CAs an advantage over new ones.  It is hard enough for a CA to get accepted by all the modern browsers.  But how can a new CA be taken seriously if 5-10% of the clients out there don’t trust it?  Or if using it leaves users with a higher-latency SSL handshake?

Problem #3: Multi-Rooting of CAs

We like to think of the CA trust list as a well-formed tree, where the roots are roots and the non-roots are not.  But because clients change their trust points over time, this is not the case.  What is a root to one browser is not a root to another.

As an example, we can look at the certificate chain presented by www.skis.com.  Poor skis.com has a certificate chain of 5733 bytes (4 packets, 2 round trips), with the following certificates:

  1. skis.com: 2445 bytes
  2. Go Daddy Secure Certification Authority: 1250 bytes
  3. Go Daddy Class 2 Certification Authority: 1279 bytes
  4. ValiCert Class 2 Policy Validation Authority: 747 bytes

In Firefox, Chrome and IE (see note below), the 3rd certificate in that chain (Go Daddy Class 2 Certification Authority) is already considered a trusted root.  The server sent certificates 3 and 4, and the client didn’t even need them.  Why?  This is likely due to Problem #2 above.  Some older clients may not consider Go Daddy a trusted root yet, and therefore, for compatibility, it is better to send all 4 certificates.
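
If you want to see what your own server sends, here is one way to dump the chain and its sizes from Python.  It is a sketch that assumes the third-party pyOpenSSL package is installed; openssl s_client -showcerts will show you the same information.

    import socket
    from OpenSSL import SSL, crypto   # third-party pyOpenSSL package

    def dump_chain(host, port=443):
        """Print each certificate the server sends and its DER-encoded size."""
        ctx = SSL.Context(SSL.SSLv23_METHOD)
        conn = SSL.Connection(ctx, socket.create_connection((host, port)))
        conn.set_connect_state()
        conn.set_tlsext_host_name(host.encode())   # SNI, for servers that need it
        conn.do_handshake()
        total = 0
        for i, cert in enumerate(conn.get_peer_cert_chain(), 1):
            der = crypto.dump_certificate(crypto.FILETYPE_ASN1, cert)
            total += len(der)
            print("%d. %s: %d bytes" % (i, cert.get_subject().CN, len(der)))
        print("total chain: %d bytes" % total)
        conn.close()

    dump_chain("www.skis.com")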

What Should Facebook Do?

Obviously I don’t know exactly what Facebook should do.  They’re smart and they’ll figure it out.  But FB’s large certificate chain suffers the same problem as the Skis.com site:  they include a cert they usually don’t need in order to ensure that all users can access Facebook.

Recall that FB sends 3 certificates.  The 3rd is already a trusted root in the popular browsers (DigiCert), so sending it is superfluous for most users.  The DigiCert cert is signed by Entrust.  I presume they send the DigiCert certificate (1094 bytes) because some older clients don’t have DigiCert as a trusted root, but they do have Entrust as a trusted root.  I can only speculate.

Facebook might be better served to move to a more well-rooted vendor.  This may not be cheap for them.

Aside: Potential SSL Protocol Improvements

If you’re interested in protocol changes, this investigation has already uncovered some potential improvements for SSL:

  • Exposing some sort of minimal user-agent would help servers select an optimal certificate chain for each client.  Or, exposing some sort of optional “I trust CA root list #1234” hint would allow the server to select a good certificate chain without knowing anything about the browser other than its root list.  Of course, even this small amount of information sacrifices some privacy.
  • The certificate chain is not compressed.  It could be, and some of these certificates compress by 30-40% (a rough check follows this list).
  • If SNI were required (sadly still not supported on Windows XP), sites could avoid lengthy lists of subject names in their certificates.  Since many sites separate their desktop and mobile web apps (e.g. www.google.com vs m.google.com), this may be a way to serve better certificates to mobile vs web clients.
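
On the compression point in the second bullet above, a rough check is easy to run from Python.  This is a sketch only: it grabs just the leaf certificate, and zlib stands in for whatever scheme a real TLS extension might use; a full chain should compress better than a single certificate, since the certificates share a lot of structure.

    import ssl
    import zlib

    host = "www.skis.com"
    pem = ssl.get_server_certificate((host, 443))   # leaf certificate only, PEM-encoded
    der = ssl.PEM_cert_to_DER_cert(pem)             # the DER bytes actually sent on the wire
    compressed = zlib.compress(der, 9)
    print("DER:        %d bytes" % len(der))
    print("compressed: %d bytes (%.0f%% smaller)" %
          (len(compressed), 100.0 * (1 - len(compressed) / float(len(der)))))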

Who Does My Browser Trust, Anyway?

All browsers use a “certificate store” which contains the list of trusted root CAs.

The certificate store can either be provided by the OS, or by the browser.

On Windows, Chrome and IE use the operating-system provided certificate store.  So they have the same points of trust.  However, this means that the trust list is governed by the OS vendor, not the browser.  I’m not sure how often this list is updated for Windows XP, which is still used by 50% of the world’s internet users.

On Mac, Chrome and Safari use the operating system provided store.

On Linux, there is no operating system provided certificate store, so each browser maintains its own certificate store, with its own set of roots.

Firefox, on all platforms (I believe, I might be wrong on this) uses its own certificate store, independent of the operating system store.

Finally, on mobile devices, everyone has their own certificate store.  I’d hate to guess at how many there are or how often they are updated.

Complicated, isn’t it?

Yeah Yeah, but Where Do I Get The Best Certificate?

If you read this far, you probably realize I can’t really tell you.  It depends on who your target customers are, and how many obscure, older devices you need to support.

From talking to others who are far more knowledgeable on this topic than I, it seems like you might have the best luck with either Equifax or Verisign.  Using the most common CAs will have the side benefit that the browser may have cached the OCSP responses for any intermediate CAs in the chain already.  This is probably a small point, though.

Some of the readers of this thread pointed me at what appears to be the smallest, well-rooted certificate chain I’ve seen.  https://api-secure.recaptcha.net has a certificate signed directly at the root by Equifax.  The total size is 871 bytes.  I don’t know how or if you can get this yourself.  You probably can’t.

Finally, Does This Really Matter?

SSL has two forms of handshakes:

  • Full Handshake
  • Session Resumption Handshake

All of this certificate transfer, OCSP and CRL verification only applies to the Full Handshake.  Further, OCSP and CRL responses are cacheable, and are persisted to disk (at least with the Windows Certificate Store they are). 
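
If you want to see the two handshake forms side by side, Python’s ssl module (3.6 or newer) can demonstrate resumption.  This is a sketch: whether resumption actually happens depends on the server, and with TLS 1.3 the session ticket arrives after the handshake, so the first saved session may be empty until some data is read.

    import socket
    import ssl

    ctx = ssl.create_default_context()   # reuse one context so the saved session stays valid

    def handshake(host, session=None):
        """Do one TLS handshake, report whether it was resumed, and return the session."""
        sock = ctx.wrap_socket(socket.create_connection((host, 443)),
                               server_hostname=host, session=session)
        print("session_reused =", sock.session_reused)
        saved = sock.session
        sock.close()
        return saved

    host = "www.google.com"
    first = handshake(host)           # full handshake: certificate chain is transferred
    handshake(host, session=first)    # abbreviated handshake: no certificate transfer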

So, how often do clients do a full handshake, receiving the entire certificate chain from the server?  I don’t have perfect numbers to cite here, and it will vary depending on how frequently your customers return to your site.  But there is evidence that this is as high as 40-50% of the time.  Of course, the browser bug mentioned in the prior article affects these statistics (6 concurrent connections, each doing full handshakes).

And how often do clients need to verify the full certificate chain?  This appears to be substantially less, thanks to the disk caching.  Our current estimates are less than 5% of SSL handshakes do OCSP checks, but we’re working to gather more precise measurements.

In all honesty, there are probably more important things for your site to optimize.  This is a lot of protocol gobbledygook.

Thank you to agl, wtc, jar, and others who provided great insights into this topic.