
Basics of PPP

Packet capture of a PPP session starting up

Table of Contents:

Preamble
Fundamentals
    Framing
    Higher level protocols
    SLIP
PPP Itself
    The basic PPP process
    The Unix Method
        Interactive Logon
        SLiRP
Conclusion

Preamble

I've worked with PPP many times in my life - primarily in the PPP over Ethernet (PPPoE) form in my job as a T1/DSL ISP technician, but also in a few other contexts, including, of course, as a dialup internet subscriber. More recently however I began looking into how dialup internet connections work at a nuts and bolts level, with the idea in mind that I might set up some kind of small dialup ISP for a hobby project.

Prior to this I had ignored PPP as simply "something that dialup uses, among other things, somehow" but once I began working with the nitty gritty of dialup networking I found that I had to learn how it worked at a low level. I was able to turn up a basic dialup connection between Windows machines without knowing much, but it turns out that this is not because PPP is simple, but because Windows uses the same default settings on its servers and clients, which obscures this entire protocol from the admin.

When I began trying to connect non-Windows clients, I immediately began to run into confusing situations. For instance, my Macintosh Performa 630 would dial up, then perform no authentication and simply declare the session up even though the server disagreed with this. It took me most of a day to figure out why, and I didn't get an inch closer to an answer until I simply buckled down and began reading the raw protocol specifications.

Of course I googled PPP and read about it on Wikipedia, but I found the dry, contextless summary there unsatisfying. I didn't understand what role it played in networking, how it interacted with Ethernet or IP, what the host was doing to provide the service, and why there were so many settings. As I would discover later, there are also two fundamentally different ways of turning up PPP sessions that the Wiki page doesn't mention at all.

I couldn't see a straightforward way to edit the article to add this information, so I figured making my own page might be a better choice. This is mostly for the sake of entertainment, since dialup networking is largely dead, though as far as I know this information is applicable to PPPoE as well, so you might find some of it useful if you work at an ISP.

If you want a fully technical understanding of PPP, I imagine you can't do much better than the O'Reilly book, which you can borrow from the Internet Archive. The article you're reading here, however, is written to help the already-capable network technician understand the role PPP plays. I hope you find it interesting or useful.

Fundamentals

PPP was born out of a need to organize data being sent over serial links. This is connected to some fundamental limitations of serial, and to the OSI layer model that you might have learned about before (but barely understood, given the quality of most teaching), so going over both will be helpful.

Framing

In other articles I've explained that serial (RS-232) connections are simple "byte pipes," over which the basic transfer unit is one byte. To state it clearly: every transaction over serial is always exactly one byte.

If you have two computers connected via a serial link, you must send one byte - eight bits - over the link before the receiving machine will do anything with them - its serial controller waits for eight bits to come across before it tells the receiving computer that any data has come in. If you were to somehow send less, you would either get no response, or a corrupted, meaningless output at the far end.

Four bits, for instance, are meaningless in the language of serial, because in both the sending and receiving machine, the hardware and software components that process serial signaling only speak in whole bytes, and are simply incapable of being given less - or more - data.

The latter part there is crucial: there's no way to make a single transaction over serial that is larger than one byte, within the constraints of RS-232 itself.

Suppose you are sending a 60-character message over a serial port. There's no built-in ceremony in the serial port signaling to give any details about your message.
You don't have a way to tell the receiving computer "okay, a message is starting," and you don't have a way to tell it "alright, that was the whole message." If the receiving computer wants to accumulate your entire message before displaying it to the user, how does it know when all of it has arrived? It needs a way to know where the beginning and end are.

This problem is endemic to all networking; in fact, nearly all computer communication. Connections between computers, or even between devices in a computer (like your PCI cards) are simply electrical signals toggling on and off, and in order for it to make any sense, the two ends of the conversation need a previously agreed upon way to interpret those signals.
This is usually called framing when it happens at an electrical level, and a protocol when it happens at higher layers of abstraction. Sometimes you'll see these terms used interchangeably.

In Ethernet, for instance, the way that bits are actually sent down the wire as on-and-off voltage pulses is called physical layer framing. It's how the receiving computer knows which pulses represent a byte, and it's also how it knows where one Ethernet message ends and another one begins. There are special patterns that let the network interface card understand that the sending computer is beginning or ending a transaction, so that the receiving machine can bundle up all the bytes that represent that transaction - known as an Ethernet frame.

Once the frame is received, the computer still only has a pile of bytes. It knows how many bytes were part of the frame, but it doesn't know what any of them mean. This is where the Ethernet data link protocol takes over. It knows that certain bytes tell it who the frame came from, other bytes tell it who the frame is intended to reach and finally what's in the frame.
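To make that "pile of bytes" idea concrete, here is a minimal Python sketch of pulling the three header fields out of an Ethernet II frame. The sample frame is made up for illustration (a broadcast destination, an invented source address, and four placeholder payload bytes), and `parse_ethernet_header` is a name I picked, not anything standard.

```python
import struct

def parse_ethernet_header(frame: bytes):
    """Split an Ethernet II frame into destination, source, EtherType and payload."""
    if len(frame) < 14:
        raise ValueError("an Ethernet II header is 14 bytes long")
    # 6 bytes destination, 6 bytes source, 2 bytes EtherType (big-endian)
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])

    def fmt(mac: bytes) -> str:
        return ":".join(f"{b:02x}" for b in mac)

    return fmt(dst), fmt(src), ethertype, frame[14:]

# Invented example frame: broadcast destination, made-up source address,
# EtherType 0x0800 (IPv4), then four placeholder payload bytes.
sample = bytes.fromhex("ffffffffffff" "020000000001" "0800") + b"\x45\x00\x00\x14"
dst, src, ethertype, payload = parse_ethernet_header(sample)
print(dst, src, hex(ethertype), len(payload))
```

The EtherType is the "what's in the frame" part: it tells the receiver which higher-level protocol should get the payload.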

If you ever learned about the OSI model, you may recognize these as layers one and two. Not much is exactly designed to fit the OSI model, but it is a useful way to think about this subject, and it maps nicely to what we're about to talk about.

Higher level protocols

In serial communications, RS-232 is the layer one protocol. It specifies a way to make electrical impulses out of bytes, and vice versa.

In early computing, RS-232 was not used as a networking protocol. It was simply a way of connecting computers directly to each other, at which point the "protocol" was usually a human typing on a keyboard. The program executed on the far end was designed with this in mind, and had its own ways of interpreting what the user was typing, so there was no framing beyond throwing bytes back and forth.

There were automated systems that communicated over serial, but all of them were designed with the expectation that they knew exactly what they were talking to. If you have a motion sensor connected to a central alarm system over RS-232, it has its own dedicated port which the central system knows goes to "sensor number 3," and both the sensor and the system know exactly how to talk to each other at all times. One will only send messages that the other is designed to understand.

When people began building computer networks, they didn't use RS-232 for the core connectivity. They built new media, like Ethernet, and when they began building larger networks, like the Internet, they designed a new protocol, Internet Protocol. These both made large networks viable, because every packet sent from one computer to another had lots of information about who it came from, where it was going, and what was inside of it. If you were at a university with a direct Ethernet connection, connecting to the Internet when it came along was trivial.

For basically everyone who wasn't at a university or business with Ethernet however, connecting to these new networks - the Internet, and others using its technology - required a modem. Networks connected to each other over a long distance or in an ad-hoc fashion also would have been forced to use dial-up links over modems. This was a problem because, as I discussed in my other article, modems simply turn a phone call into "a very long serial cable," and provide no function beyond that.

A serial connection is not a great way to send packets back and forth. You can't send Ethernet over it, because it simply doesn't have the properties of Ethernet, electrically or logically. It doesn't even have framing beyond single bytes. You need a protocol that will turn the "byte pipe" into something that sends whole transactions.

SLIP

The earliest standard solution to this (although it literally called itself a "nonstandard") was called SLIP, which was published in 1988 and took the place of Ethernet in the network stack. It was whipped up in a hot hurry to solve the problem in the quickest, dirtiest way possible, and it really did solve it in the quickest, dirtiest way possible.

SLIP provided framing to serial connections, but it was just enough that a computer could hurl raw IP packets down a serial connection and the receiving machine would know where each one ended. Absolutely everything else had to be handled in other ways. There was no ability to negotiate how a connection would work - the calling machine needed to know what IP address it was supposed to have, for instance. There was also no authentication built in; a topic that comes up again in the PPP process described later in this article.

In researching all this, I read the SLIP RFC, and if you're remotely technical you should too, because it's a hoot. The author as much as says "Don't use this, something better is coming."
It's hard to even call it a protocol. It is, and I am not joking here, absolutely nothing more than IP packets shot down a serial line, raw, with a single special byte (called "END", value 0xC0) at the end of each one.

Every time the receiving machine sees an END byte, it just wraps up everything it received since the last END, says "yep, that's a packet" and hands it off to the OS IP layer.
There's literally nothing else in the protocol definition, except to say that if the magic END byte happens to occur inside of an actual packet, it has to be replaced by a two-byte sequence starting with an "escape" byte (ESC, value 0xDB), and that an ESC byte occurring in the data gets the same treatment.

This is such a barebones "protocol" that it actually has no way to handle line noise.
In virtually every network protocol that exists, there is some kind of checksum field to confirm that data wasn't damaged during transit. SLIP doesn't have this, but the RFC says that someone suggested just sending an END at the beginning of every packet.
That way, if any spurious line noise had produced erroneous data since the last packet had been sent, the receiving machine would wrap those up into a "packet," hand it to the IP layer - which would promptly throw it away since it was impossible to parse as IP - and move on with a clean slate.
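The whole of SLIP fits in a few lines of code. The sketch below is my own Python rendering of the scheme in RFC 1055, not the RFC's C listing: the four byte values come from the RFC (END is the single byte 0xC0), while the function names are just ones I chose. The encoder also sends the extra leading END suggested in the RFC.

```python
END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD  # byte values from RFC 1055

def slip_encode(packet: bytes) -> bytes:
    """Wrap one IP packet for the wire, escaping END and ESC inside it."""
    out = bytearray([END])  # leading END flushes any line noise at the receiver
    for b in packet:
        if b == END:
            out += bytes([ESC, ESC_END])
        elif b == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(b)
    out.append(END)
    return bytes(out)

def slip_decode(stream: bytes) -> list:
    """Split a received byte stream into packets at every END byte."""
    packets, current, escaped = [], bytearray(), False
    for b in stream:
        if escaped:
            # An unexpected byte after ESC is a protocol violation;
            # this sketch simply keeps it as-is.
            current.append({ESC_END: END, ESC_ESC: ESC}.get(b, b))
            escaped = False
        elif b == ESC:
            escaped = True
        elif b == END:
            if current:  # back-to-back ENDs produce empty packets, which are dropped
                packets.append(bytes(current))
            current = bytearray()
        else:
            current.append(b)
    return packets

wire = slip_encode(b"\x01\xc0\x02\xdb")
print(wire.hex(), slip_decode(wire))
```

Note what is missing: no length, no checksum, no addresses. If noise arrives before the leading END, the decoder hands it over as a "packet" and it is up to the IP layer to throw it away.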

SLIP was extremely limited and didn't last long. By 1990, the PPP standard was published, and it provided much more sophisticated solutions to all of these problems.

PPP Itself

In summary (especially if you skipped right to this part):
PPP is a layer two protocol that replaces Ethernet in the network stack so that IP packets can be sent over plain serial connections, usually over a modem link.
It's actually a pretty nice one, all things considered.

The first feature of PPP is that it provides framing on the serial link - it has well defined byte sequences that mark the beginning and end of a packet, and then a number of fields that describe its contents.
This framing is not actually unique - in fact, it's almost identical to another existing standard, HDLC.
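For the curious, here is a rough Python sketch of that HDLC-like framing as RFC 1662 describes it for asynchronous links: a flag byte (0x7E) on each end, a fixed address (0xFF) and control (0x03) byte, a two-byte protocol number, the payload, and a 16-bit frame check sequence. It assumes the default settings before anything has been negotiated, where every control character below 0x20 is escaped. The function names are mine.

```python
FLAG, ESCAPE = 0x7E, 0x7D

def fcs16(data: bytes) -> int:
    """16-bit frame check sequence used by PPP (bit-reflected polynomial 0x8408)."""
    fcs = 0xFFFF
    for byte in data:
        fcs ^= byte
        for _ in range(8):
            fcs = (fcs >> 1) ^ 0x8408 if fcs & 1 else fcs >> 1
    return fcs ^ 0xFFFF

def ppp_frame(protocol: int, payload: bytes) -> bytes:
    """Build one escaped PPP frame, flags included."""
    body = bytes([0xFF, 0x03]) + protocol.to_bytes(2, "big") + payload
    body += fcs16(body).to_bytes(2, "little")  # FCS is sent low byte first
    out = bytearray([FLAG])
    for b in body:
        # Default behaviour: escape control characters, the flag and the escape byte
        if b < 0x20 or b in (FLAG, ESCAPE):
            out += bytes([ESCAPE, b ^ 0x20])
        else:
            out.append(b)
    out.append(FLAG)
    return bytes(out)

def ppp_unframe(frame: bytes):
    """Undo the escaping, check the FCS, and return (protocol, payload)."""
    if len(frame) < 2 or frame[0] != FLAG or frame[-1] != FLAG:
        raise ValueError("frame must start and end with the flag byte")
    body, escaped = bytearray(), False
    for b in frame[1:-1]:
        if escaped:
            body.append(b ^ 0x20)
            escaped = False
        elif b == ESCAPE:
            escaped = True
        else:
            body.append(b)
    if len(body) < 6 or fcs16(bytes(body[:-2])) != int.from_bytes(body[-2:], "little"):
        raise ValueError("bad FCS")
    return int.from_bytes(body[2:4], "big"), bytes(body[4:-2])

frame = ppp_frame(0xC021, b"\x01\x01\x00\x04")  # 0xC021 is LCP's protocol number
print(frame.hex(" "))
print(ppp_unframe(frame))
```

Unlike SLIP, a damaged frame is caught here: flip any byte in `frame` and `ppp_unframe` raises instead of handing garbage upward.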

HDLC isn't specifically an IP transport, it was just a known method to "packetize" data on a serial connection. It could have been used just like SLIP was - to just cram IP packets into the serial pipe and let the other machine unbundle and transmit them onto the destination network, but there were other features users and networks needed, requirements that were unique to dialup service.

They needed:

PPP solves all of this by providing a suite of sub-protocols, if you will - initially just LCP and NCP (the latter being a family of Network Control Protocols, one per network-layer protocol), which are used to set up the initial connection.
Later standards provided IPCP (the NCP for IP), CCP, MP, PAP, CHAP, and many other things to fill in all these gaps.

The basic PPP process

Here's the meat of it - what you were probably looking for when you came here. What actually happens when you dial up and establish a PPP session?

Well, there are actually two versions of how things can go down. Here's how a very modern (Windows XP to Server 2k3) exchange looks:

  1. Your PC tells your modem to dial your ISP
  2. Your ISP's modem receives a RING and answers
  3. The two modems handshake, and now there's a serial connection between your system and theirs
  4. On your PC, the PPP client sends the initiating PPP packet, an LCP Configuration Request which tells the host what features it has
  5. The host sends back an LCP Configuration Request saying what features it has
  6. Client and host can now send LCP Configuration Reject (or Configuration Nak, to suggest a different value) if they don't like something the other wants, and this can proceed until both sides have sent a Configuration Ack and agree on a set of features.
  7. The client sends an LCP Identification packet with its hostname, among other things
    1. This is a much later feature (added in 1994); it is not used for authentication, it just helps the host know who it's talking to so it can make decisions even before authentication is possible.
  8. The client authenticates. There are many protocols for this:
    1. PAP, in which the client simply sends a single packet containing a username and password, and the host either accepts or rejects it.
    2. CHAP, which provides some level of security by hashing the password.
    3. EAP, which extends authentication to support all kinds of wacky auth schemes, some perhaps even a little secure
    4. MS-CHAP / MS-CHAPv2, Microsoft's variants of CHAP
    5. Probably countless other options
  9. The client and server now exchange CCP packets to negotiate data compression - very important, because it can more than double the effective speed of a connection
  10. The client sends an IPCP Configuration Request in order to ask for an IP address
    1. This is similar to DHCP in that the client can ask for a specific IP, which the server may allow or deny
    2. If the client sends 0.0.0.0 in the request, the server will pick an address of its own choice and send it back in an IPCP Configuration Nak.
  11. The client repeats its request with the values the server offered, and the server responds with an IPCP Configuration Ack to confirm the client's IP address and DNS server addresses.
  12. The connection is now established. IP packets begin flowing, and when they arrive at the host end, they're forwarded onto the IP network.
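The request/response steps in the list above all use the same packet layout: a code byte (1 = Configure-Request, 2 = Ack, 3 = Nak, 4 = Reject), an identifier, a length, and then a list of type-length-value options. Here is a small Python sketch of the IP address exchange under RFC 1332, where the server hands out an address by Nak-ing the client's 0.0.0.0 and the client asks again. The `server_reply` policy and the 192.0.2.10 address are invented for the example; a real server does far more checking.

```python
import socket
import struct

CONF_REQ, CONF_ACK, CONF_NAK, CONF_REJ = 1, 2, 3, 4
IPCP_OPT_IP_ADDRESS = 3  # IPCP option type for "IP-Address"

def build_packet(code, identifier, options):
    """options is a list of (type, value_bytes); each is sent as type, length, value."""
    body = b"".join(bytes([t, len(v) + 2]) + v for t, v in options)
    return struct.pack("!BBH", code, identifier, len(body) + 4) + body

def parse_packet(packet):
    code, identifier, length = struct.unpack("!BBH", packet[:4])
    options, pos = [], 4
    while pos < length:
        opt_type, opt_len = packet[pos], packet[pos + 1]
        if opt_len < 2:
            raise ValueError("malformed option")
        options.append((opt_type, packet[pos + 2:pos + opt_len]))
        pos += opt_len
    return code, identifier, options

def server_reply(request, address_to_assign):
    """Hypothetical server policy: Nak any address other than the one we assign."""
    _, identifier, options = parse_packet(request)
    wanted = socket.inet_aton(address_to_assign)
    for opt_type, value in options:
        if opt_type == IPCP_OPT_IP_ADDRESS and value != wanted:
            return build_packet(CONF_NAK, identifier, [(opt_type, wanted)])
    return build_packet(CONF_ACK, identifier, options)

def negotiate_address(address_to_assign="192.0.2.10"):
    """Client side: start by asking for 0.0.0.0, then accept whatever gets Nak-ed back."""
    requested = "0.0.0.0"
    for identifier in range(1, 6):
        request = build_packet(
            CONF_REQ, identifier, [(IPCP_OPT_IP_ADDRESS, socket.inet_aton(requested))]
        )
        code, _, options = parse_packet(server_reply(request, address_to_assign))
        if code == CONF_ACK:
            return requested
        requested = socket.inet_ntoa(options[0][1])
    raise RuntimeError("negotiation did not converge")

print(negotiate_address())
```

LCP and CCP negotiation follow the same pattern with different option types, which is a big part of why the protocol has so much room for expansion.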

There also could be steps in here which would establish encryption if enabled, but I don't know exactly how those look.
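Going back to the authentication step in the list, the "hashing" in CHAP is simple enough to show. Per RFC 1994, the server sends a random challenge, and the client answers with the MD5 digest of the packet identifier, the shared secret, and that challenge, so the password itself never crosses the wire. This is only a sketch of the calculation; the identifier, secret, and function names are made up.

```python
import hashlib
import hmac
import os

def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """MD5 over the one-byte identifier, then the shared secret, then the challenge."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()

def chap_verify(identifier: int, secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Server side: recompute the digest from its own copy of the secret and compare."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)

challenge = os.urandom(16)            # server picks a fresh challenge per attempt
response = chap_response(7, b"hunter2", challenge)   # invented identifier and secret
print(chap_verify(7, b"hunter2", challenge, response))
```

Because the challenge changes every time, a captured response is useless for a later login, which is the "some level of security" CHAP adds over PAP.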

This is a pretty slick process. It's quick, flexible, and has lots of room for expansion - a well designed protocol, at least in my eyes, and one which adds very little overhead; about eight bytes per packet on a good day, which isn't nothing over serial, but could be much worse.

There's a totally different way of setting up PPP which is worth discussing however, because it influenced the design of all PPP software, and that's the Unix approach.

The Unix Method

You should have some context for this. It's half the fun.

People had been dialing in to university, business, and network service providers' systems for decades by the time the Internet was created. Shared Unix systems in particular were commonly available throughout the eighties, and when the Internet came along, they were the first machines to be connected to it.

You should keep in mind that at this time - circa 1989, say - home computers were still pathetic in comparison to virtually any Unix machine. In addition to simply having a more sophisticated OS than any PC was running, you could expect Unix machines to have much, much more speed and storage, as well as already having access to things like Usenet and email.

At that time, it was not typically the case that you would have, say, an email client on your home computer. Instead, you would dial in to a Unix machine owned by your employer, school, or another provider with access to the nascent wide area networks of the 80s, log in to a shell account, and then run your email client on the remote machine.

The shell account gave the connected user access to a Unix shell - very much like the /bin/sh we still have on Linux - from which they could do anything they liked on the remote machine, but only interactively, as if they were sitting at a keyboard connected to the Unix machine itself. Your home computer could not "join" the larger network as a peer - once dialed in to a shell, it was nothing more than a terminal attached to a computer that was a fully fledged member of the growing networked world.

With the introduction of IP it started to become sensible that your machine at home could be a member of the Internet in its own right. IP enabled network operators to assign addresses to users joined over ad-hoc connections and route them through arbitrarily complex networks. The most obvious way to do this was to upgrade the existing Unix systems to offer that service to dialed-in users, and that's what SLIP, and later PPP enabled.

Using the exact same methods - a modem connected to a serial port - the Unix systems could now offer Internet connectivity. The problem was, what if you didn't want that?

Interactive Logon

Even after Internet access was available, not everything was on the Internet, and people didn't instantly move all their tasks to their home computer. Many never did - in fact, many still haven't. You might know someone who has a perfectly functional Windows, Mac or Linux computer in front of them, but does almost everything through an SSH session to some server. They have their reasons, and people did in 1989 as well.

Unix systems couldn't stop serving up shell accounts, which presents a problem, because if you dial in to a system like that, how does it know whether you want a PPP session or a shell session? With a Windows server, or a dedicated ISP server, it's no problem - as soon as a user connects you just blast a PPP LCP packet at them and wait for their response. But with a Unix system, in the late 80s, you actually have to assume that most users don't want internet access, but offer a way to get it if they want it.

The solution was to implement PPP as a program a user could run when they wanted internet access. A user with a normal shell account could dial up, connect, check their email and newsgroup updates, and then, if they wanted, type "ppp". This would launch a PPP server process on the Unix machine, which would immediately spit out a very distinctive PPP opening salvo:

!}#@!}!}*}}8}!}$}%\}"}&} } } } }%}&T|}>M}~}"}(}"7z~

The PPP client software running on their machine would recognize this, take over the serial connection and complete the PPP exchange and initiate an internet connection. The consequence of this is that all PPP clients had to (and still have to) support this capability.
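Every `}` in that gibberish is PPP's escape byte (0x7D), with the following character XORed with 0x20, so `}#` is really 0x03 and `}!` is 0x01. On the wire, an LCP frame sent with default settings starts with the bytes 7E FF 7D 23 C0 21. A client could spot the handover with something like the Python sketch below, which watches the incoming terminal bytes for that prefix. This is my guess at the general approach, not how any particular PPP client is written, and `PppSniffer` is a made-up name.

```python
# Flag, address 0xFF, escaped control byte 0x03, then LCP's protocol number 0xC021
LCP_START = bytes([0x7E, 0xFF, 0x7D, 0x23, 0xC0, 0x21])

class PppSniffer:
    """Watches a byte stream that arrives in arbitrary chunks for the start of LCP."""

    def __init__(self):
        self._tail = b""

    def feed(self, chunk: bytes) -> bool:
        # Keep the last few bytes so a prefix split across two chunks is still found
        window = self._tail + chunk
        found = LCP_START in window
        self._tail = window[-(len(LCP_START) - 1):]
        return found

sniffer = PppSniffer()
print(sniffer.feed(b"Last login: never\r\n$ ppp\r\n~\xff}"))  # prefix not complete yet
print(sniffer.feed(b"#\xc0!}!}!"))                             # rest of the prefix arrives
```

Until `feed` returns true the bytes belong to the terminal session; after that, the PPP client takes over the line.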

The PPP client for classic MacOS, for instance, MacPPP, supports an option to display a Terminal window in which you can manually tell the modem to dial, wait for the Unix login prompt, type your username and password, then type "ppp" and, once you see that distinctive string of gibberish, click OK to tell the PPP client to start the negotiation process. I'm not sure if this was actually used by many people or was more of a diagnostic tool.

You couldn't easily automate the login process to a Unix system because it was, after all, simply bytes being thrown at your machine. If you took the naive approach, your PPP client could just wait for the modem handshake, then fire your username and password down the line, followed by the ppp command. The problem is you'd never get the timing right if you did this, because login prompts usually looked like this:

Welcome to the Berkeley South Campus Unix user system!
WARNING: This system is for use by students and faculty of Berkeley University! Unauthorized access will be recorded and prosecuted!

Username:

If you tried to automate this naively, your machine would probably punch in your username about when the remote system was halfway through the word "Welcome" - or it would send the ppp command while the system was busy authenticating your password, and it would just be absorbed and ignored.

The solution to this was something called a chat script. If you've ever used the Linux utility expect, this works just like it. This is a script that you configure with the exact messages that the server is expected to output. It's how you tell the PPP client what to wait for to know when it's safe to send your username, password, and whatever other commands are needed to start the connection.

Below is an example chat script from MacPPP.

Setting up a PPP connection on a Mac

The "wait" lines mean "wait for this text," and the "out" lines mean "send this to the server."

You'll notice a very important detail here: The username and password fields on this Unix machine do not simply say "username," they say "Annex username."
Some systems also simply used the string "Username" - with a capital U, which the chat script needed to account for.
As far as I can tell nobody was ever completely comfortable with this, even if they knew which capitalization their server used, because every chat script I've seen only checks for the partial strings "sername" and "assword".

The shell prompt is also not a "#" like it is on many Unix systems - it's been suggested to me that this is probably a Bay MicroAnnex XL or similar device, a type of "terminal server" or "remote access server."

The point is, you'd need to customize this to your specific dialup internet server, and if that server ever changed behavior, your script would stop working... and you wouldn't know why. It would just sit there and wait, and do nothing, and all you'd be able to do is log in to the machine manually to figure out what changed.
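The wait-then-send loop at the heart of a chat script is easy to sketch. The Python below is an illustration of the idea, not MacPPP's or any real dialer's implementation: the server is faked with an in-memory buffer that already contains every prompt, the banner text, username, and password are invented, and a byte budget stands in for the wall-clock timeout a real tool would use.

```python
import io

def run_chat(stream_in, stream_out, script, max_bytes=4096):
    """script is a list of (expect, send) pairs, processed strictly in order."""
    for expect, send in script:
        seen = b""
        while not seen.endswith(expect):
            byte = stream_in.read(1)
            # End of input or too much unexpected output: give up instead of hanging
            if not byte or len(seen) >= max_bytes:
                raise TimeoutError(f"never saw {expect!r}")
            seen += byte
        stream_out.write(send + b"\r")

# Invented server output; matching on "sername:" works for either capitalization
fake_server = io.BytesIO(
    b"Welcome to the Unix user system!\r\n\r\nUsername: "
    b"Password: "
    b"\r\nLast login: never\r\n$ "
)
sent = io.BytesIO()
script = [(b"sername:", b"alice"), (b"assword:", b"secret"), (b"$ ", b"ppp")]
run_chat(fake_server, sent, script)
print(sent.getvalue())
```

The failure mode described above shows up as the `TimeoutError`: if the server changes its prompt, the script never sees the text it is waiting for, and all it can report is which string never arrived.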

It was a wild time.

SLiRP

As a footnote to this topic, we should talk about SLiRP, because it's actually still somewhat relevant.

PPP, as an open protocol, was of course implemented in a lot of different ways. Unix had ppp, Solaris had aspppls, and I'm sure other Unix variants had their own implementations. All of these were operating system features, and operated hand in hand with the OS itself. This meant you needed to have permission to use them.

Suppose you didn't have permission to run PPP. You had access to a shell account on a Unix machine connected to the internet, you could run your own IP-capable programs on it, but you weren't allowed to initiate PPP, connect your home computer, and download porn with Mosaic 1.0. This was frequently the case, probably because system access sold as a shell account would have cost much less in terms of bandwidth than directly connecting a home computer. These would have been separate tiers of service, and some system operators wouldn't have wanted to provide direct internet access period.

Well, the enterprising computer nerd has no time for restrictions of that sort, even if they cost someone else a lot of money, and so along came SLiRP.
SLiRP was written by some Australian guy around 1995 or so, distributed for free (so, probably much more popular than the commercial program TIA which did much the same thing a little earlier), and it simply cheats the system.

You would download this program, compile it within your Unix shell account, and execute it just like the ppp command. It would then negotiate PPP (or SLIP, if you wanted that for some reason) exactly the same as the native ppp server, and voila, you're on the internet.

The difference is that, unlike the ppp server, which interacts with the OS IP stack to obtain a dedicated IP and therefore make your home computer a first-class citizen, SLiRP actually implemented an early form of NAT. Since programs running under your shell account had the right to make their own IP connections, SLiRP proxied all the packets from your home system through its own process so it could operate without the OS' assistance.
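The bookkeeping behind that kind of translation can be sketched in a few lines of Python. To be clear, this is a toy illustration of the general NAT idea (one shared address, many remembered flows) and not a description of SLiRP's actual internals; the class, its methods, and all addresses are hypothetical.

```python
import itertools

class NatTable:
    """Maps (client_ip, client_port) flows onto ports of the shell host's own address."""

    def __init__(self, host_ip, first_port=40000):
        self.host_ip = host_ip
        self._ports = itertools.count(first_port)
        self._out = {}   # (client_ip, client_port) -> host port
        self._back = {}  # host port -> (client_ip, client_port)

    def outbound(self, client_ip, client_port):
        """Return the (address, port) the outside world sees for this client flow."""
        key = (client_ip, client_port)
        if key not in self._out:
            port = next(self._ports)
            self._out[key] = port
            self._back[port] = key
        return self.host_ip, self._out[key]

    def inbound(self, host_port):
        """Return the client flow a reply belongs to, or None if nobody asked for it."""
        return self._back.get(host_port)

table = NatTable("198.51.100.7")              # invented shell host address
print(table.outbound("10.0.2.15", 1025))      # invented dialed-in client address
print(table.inbound(40000))
```

The `None` case is why the home computer isn't a first-class citizen under this scheme: nothing on the Internet can reach it unless it started the conversation.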

SLiRP is interesting because you can still use it without much configuration today. I experimented with it to get a Mac online using a Raspberry Pi and a plain serial connection, and it worked... well... it worked okay. It wasn't great, for reasons I'll detail at a later date in an article I'm still working on, but it did work, and all I had to do was install one program, not create a bunch of configs and a virtual network interface and who knows what else.

You might also find it useful to make two old computers talk to each other - setting up PPP on an HP-UX system is probably a pain, but if you can get the SLiRP code on there, you can get an IP connection to anything you want to plug into the serial port.

Conclusion

I'm very new to all of this and I may have gotten some things wrong, especially the historical context, which is very hard to find out from scratch at this point. If you have firsthand experience and have a factual disagreement with any of this, please let me know. I was barely conscious when most of this went out of style.

With that said - not to totally excuse myself from responsibility - this topic mostly just doesn't matter anymore, so I wrote this more for entertainment than for instruction. It's accurate to the best of my knowledge, but I've speculated and cut corners in a few places, and I have not yet even finished the O'Reilly PPP book. I find it easier to write as I learn, however, because you're never done learning.


If this was interesting to you, or if you did something interesting with it, email me: articles@gekk.info
