Troubleshooters.Com Presents

Linux Productivity Magazine

March 2008

Socket Programming Intro

Copyright (C) 2008 by Steve Litt. All rights reserved. Materials from guest authors copyrighted by them and licensed for perpetual use to Linux Productivity Magazine. All rights reserved to the copyright holder, except for items specifically marked otherwise (certain free software source code, GNU/GPL, etc.). All material herein provided "As-Is". User assumes all risk and responsibility for any outcome.





Linux is a wishing well.  --  Steve Litt, from the December 2002 Linux Productivity Magazine


Editor's Desk

By Steve Litt
How do programs communicate across a network? Primarily, with sockets. If you want two programs to communicate across a network, without using a canned solution like ssh, you'll need sockets.

A socket is one endpoint of a communication across a network. In a simplified view, programs A and B connect their sockets, after which each program sends data by writing to its socket and receives data by reading from its socket.

Sockets can provide a very thin interface between two programs, improving encapsulation and even security if written right. They form the basis of most internetwork client/server programs.

Several years ago I wrote a socket tutorial based on the xinetd system. It was excellent until it stopped working. Also, it was, to a certain extent, black-boxy -- you couldn't really understand the underlying technology. Unlike that old tutorial, this Linux Productivity Magazine issue discusses the structure and function of sockets, programs them in simple C, and doesn't use the xinetd system. By the time you finish this magazine, you'll understand at least the basics of sockets from the inside out.

So kick back and enjoy this magazine, hacking away at socket programming from the familiar to the unknown. And remember, if you use GNU/Linux, or BSD, or Unix or a Unix workalike, this is your magazine.
Steve Litt is the author of Troubleshooting Techniques of the Successful Technologist.   Steve can be reached at his email address.

Basic Terminology

By Steve Litt
Before discussing sockets, let's get some basic terminology defined...
Term Definition
Socket An endpoint in network communications between programs. Generally speaking, two programs communicate. One is called the client, and one is called the server. Client and server programs will be defined later in this glossary.
Two types of sockets Two types of sockets are active and passive.
Active socket A socket used to trade data between two programs, using a connection.
Passive socket A socket used by the server program to listen for clients attempting to connect. When the attempt is detected, the accept() function is used to create a new active socket to service the incoming client. The server needs an active socket for every client with which it's communicating, but it needs only one passive socket to listen for new clients attempting to connect.
Protocol An agreement between programs as to what the data should look like.
TCP A protocol in which first a line of communication (a connection) is established between two programs, and then the two programs begin trading data. TCP guarantees you'll have a connection, or else you'll know you don't have it. It guarantees data is received in the same order it's transmitted, and as long as the connection stays up, it will be received. TCP data is sent as a stream of bytes.
UDP A protocol in which data is traded without first establishing a line of communication. The programs throw data at each other on the assumption that it will be received. If it's necessary to ascertain whether sent data was received, the receiving program must be programmed to send an acknowledgement. UDP data is sent in datagrams instead of a byte stream, and there is no guarantee that the datagrams will be received in the same order as they were sent. UDP does a lot less for the programmer than TCP does, but if the communicating programs are programmed right, it can communicate across unreliable networks that would break TCP connections.
Client/Server An interapplication structure in which one program (the server) serves up data to one or more anonymous other programs (the clients). In socket programming, the server is the program that waits for anonymous clients to connect. The client is the anonymous program that initiates a connection with the known server program.
Server Within the sockets context, this is a program, at a known IP address with a known port number, that waits for client programs to connect to it.  The way it waits is by listening to its port, via a passive socket created with the listen() function. The listening and waiting itself is done by the accept() function.
Client Within the sockets context, this is a program that initiates a connection with a server, using the known IP address and port of that server. Once the connection is established, the client and server can trade data.
Blocking A function is said to be blocking if it "hangs" until receiving a response. The advantage of blocking is a guaranteed synchronization between the function call, the response, and whatever comes after the function call. The disadvantage is that the blocking function can hang the entire program while waiting for a response. There are various programming techniques, including use of the select() function and forking off communication processes with the fork() function, to limit the harm from this disadvantage. Some of the major blocking functions in the sockets world are:
  • accept()
  • recv()
  • send()
  • read()
  • write()
Non-blocking A function is said to be non-blocking if it does not wait for a response before continuing. This eliminates delays, but requires the programmer to track state and synchronization much more tightly. Non-blocking functions in the sockets world include:
  • socket()
  • bind()
  • listen()
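  • select() (only when given a zero timeout; otherwise it blocks until a socket is ready or the timeout expires)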
Basic server function sequence
handle=socket(AF_INET,SOCK_STREAM,0)
The socket() function creates the data structure for a socket, and assigns it to an integer handle, which from then on refers to that socket's data structure. The handle is the function's return value. For TCP sockets, the middle argument is SOCK_STREAM. For UDP sockets, the middle argument is SOCK_DGRAM. On error it returns -1.
Declare and load struct sockaddr_in
The sockaddr_in structure has three important elements you need to fill in:
  • sin_addr.s_addr
  • sin_port
  • sin_family
sin_addr.s_addr can name a specific IP address, or the wildcard INADDR_ANY. sin_port is a big-endian (network byte order) integer representation of the port number. Passing the port number through htons() guarantees a big-endian result on any machine. If you forget to use htons() on a little-endian machine such as x86, the two bytes of the port number are swapped and the socket ends up on a different port than you intended. For network type (IPv4) sockets, sin_family is always AF_INET.

Once you've filled in this structure, you can pass it (by reference via its address) to functions like bind() and accept().
result=bind(listener_socket,(struct sockaddr*)&addr,sizeof(addr))
bind() binds the socket to a specific port number and maybe a specific IP address, depending on how you set the sockaddr_in structure. Note that the sockaddr_in structure is passed by address, cast to the generic struct sockaddr pointer type. bind() only reads the structure and does not modify it. The return value is an integer result, and if it's -1, the bind() call failed.
result=listen(listener_socket, NUMBER_OF_CLIENTS_TO_QUEUE)
listen() converts its first argument to a passive socket for listening. The second argument defines how many client connections can wait in the queue to be accepted; clients beyond that number may be refused. A traditional value for that second argument is 5.
handle=accept(listener_socket,(struct sockaddr*)&addr,(socklen_t *)&sockaddrlen);
accept() waits for a client to try to connect to its passive socket first argument, and when a client tries to connect, it returns a new handle to an active socket suitable for sending data to and from that client. Note that the second argument is a pointer to sockaddr so it can be modified by accept(), and that the third argument, a length, is passed by address so that accept() can even change the length. If you forget to pass the third argument by address, strange runtime errors will occur.
chars=recv(data_socket, buf, BUFSIZE-2, 0)
This is how you read data from the socket handle. In other words, how you receive the data sent by the other program. The argument at the end is a special flag set that's usually 0.
send(data_socket, sendstring, strlen(sendstring), 0)
This is how you write data to the socket handle. In other words, how you send data to the other program.
shutdown(listener_socket, SHUT_RDWR)
This is how you disable communication through the socket.
close(listener_socket)
This is how you destroy the socket data structure.

Steve Litt is the author of the Troubleshooting: Just the Facts. Steve can be reached at his email address.

The Basic Server Process

By Steve Litt
Here's the basic process performed by a TCP based socket server.
  1. Make the socket data structure (socket())
  2. Declare and fill in the struct sockaddr_in structure
  3. Bind the socket structure to a port (bind())
  4. Convert the socket to a passive listener (listen())
  5. Accept a client and open an active socket for it (accept())
  6. Send and receive data to and from the client (send() and recv())
If you remember the preceding process while looking at server code in this magazine, everything will seem simple. Otherwise, it will seem like gobbledygook.
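
Steps 1 through 4 of the list above can be sketched as one function. This is a condensed illustration, not the code developed step by step in the next article. The name make_listener() is hypothetical, and so is the convention of returning -1 on any failure.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical helper: performs steps 1 through 4 and returns a passive
   (listening) socket, or -1 on failure. Port 0 asks the kernel to pick
   any free port. */
int make_listener(unsigned short port, int backlog){
    int listener = socket(AF_INET, SOCK_STREAM, 0);        /* step 1 */
    if(listener == -1) return -1;

    struct sockaddr_in addr;                                /* step 2 */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if(bind(listener, (struct sockaddr*)&addr, sizeof(addr)) == -1  /* step 3 */
       || listen(listener, backlog) == -1){                         /* step 4 */
        close(listener);
        return -1;
    }
    return listener;
}
```

Steps 5 and 6 would then operate on the handle returned by accept() for that listener.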
Steve Litt is the author of Twenty Eight Tales of Troubleshooting.   Steve can be reached at his email address.

TCP Socket Server Hello World

The following does nothing but create a socket data structure and assign it to an integer handle. It shouldn't be challenging at all. The socket() function builds the data structure and returns an integer handle that refers to it. For the IPv4 network sockets used in this magazine, the first argument is AF_INET. The second argument is SOCK_STREAM for TCP sockets and SOCK_DGRAM for UDP sockets. (Unix domain sockets use the same two socket types, but with AF_UNIX as the first argument.) The third argument selects the protocol, and 0 requests the default protocol for the given socket type.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <netinet/in.h>
#include <unistd.h>

#define SOCK_ERR -1
#define BUFSIZE 100
#define NUMBER_OF_CLIENTS_TO_QUEUE 5
#define PORT 43210 //change if 43210 already in use

void abortt(const char * msg){
printf("\nFATAL ERROR: %s\n", msg);
exit(1);
}

int main(int argc, char* argv[]){
printf("Starting server\n");

// MAKE A SOCKET
printf("Making a socket.\n");
int listener_socket = socket(AF_INET,SOCK_STREAM,0);
if(listener_socket == SOCK_ERR) {
abortt("Could not make a socket!");
}
printf("Listener socket handle=%d\n", listener_socket);

return(0);
}

The preceding code produces the following output, with an integer handle of value 3 returned:

[slitt@mydesk sockets]$ ./a.out
Starting server
Making a socket.
Listener socket handle=3
[slitt@mydesk sockets]$

If it errors out, troubleshoot.
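
A first troubleshooting step is to ask the C library why the call failed. The following is a minimal sketch that assumes the usual errno mechanism. The helper name socket_or_explain() is hypothetical and is not used elsewhere in this magazine.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: like socket(), but on failure prints the reason
   recorded in errno. Returns the handle, or -1 on error. */
int socket_or_explain(int domain, int type){
    int handle = socket(domain, type, 0);
    if(handle == -1){
        int saved = errno;   /* copy errno before any other library call */
        fprintf(stderr, "socket() failed: %s (errno=%d)\n", strerror(saved), saved);
    }
    return handle;
}
```

The same pattern works after bind(), listen(), accept(), send() and recv(), all of which set errno when they return -1.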

As a next step, declare and fill in the sockaddr_in structure. This won't do anything, but if you make a typing mistake it will fail to compile, alerting you to the problem:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <netinet/in.h>
#include <unistd.h>

#define SOCK_ERR -1
#define BUFSIZE 100
#define NUMBER_OF_CLIENTS_TO_QUEUE 5
#define PORT 43210 //change if 43210 already in use

void abortt(const char * msg){
printf("\nFATAL ERROR: %s\n", msg);
exit(1);
}

int main(int argc, char* argv[]){
printf("Starting server\n");

// MAKE A SOCKET
printf("Making a socket.\n");
int listener_socket = socket(AF_INET,SOCK_STREAM,0);
if(listener_socket == SOCK_ERR) {
abortt("Could not make a socket!");
}
printf("Listener socket handle=%d\n", listener_socket);

// DECLARE AND LOAD sockaddr_in STRUCT
printf("Declaring and loading sockaddr_in struct\n");
struct sockaddr_in addr;
addr.sin_addr.s_addr=INADDR_ANY;
addr.sin_port=htons(PORT);
addr.sin_family=AF_INET;
printf("sockaddr_in struct declared and loaded.\n");

return(0);
}

The preceding code should produce the following output:
[slitt@mydesk sockets]$ ./a.out
Starting server
Making a socket.
Listener socket handle=3
Declaring and loading sockaddr_in struct
sockaddr_in struct declared and loaded.
[slitt@mydesk sockets]$

The preceding output just added a couple informational messages.
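
To see what htons() actually stores, here's a small sketch that fills the structure and then exposes the two bytes of sin_port in memory order. The helper names fill_any_addr() and port_bytes() are hypothetical. Zeroing the structure with memset() first is a common precaution rather than something the code above does.

```c
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: zero the structure, then fill in the three
   fields for "any local address, this port". */
void fill_any_addr(struct sockaddr_in *addr, unsigned short port){
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_ANY);
    addr->sin_port = htons(port);   /* stored big-endian (network order) */
}

/* Copy the two bytes of sin_port, in memory order, to out[0] and out[1]. */
void port_bytes(const struct sockaddr_in *addr, unsigned char out[2]){
    memcpy(out, &addr->sin_port, 2);
}
```

Port 43210 is 0xA8CA in hex, so port_bytes() yields 0xA8 followed by 0xCA on any machine, and ntohs(addr.sin_port) converts the field back to 43210.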

Now it's time to bind the socket to a port, as defined by the sockaddr_in structure you defined and filled in earlier. If you do this wrong, the call to bind() should return -1. Otherwise, you'll see the informational messages.

The new code added is a call to bind(). The prototype for this function is this:
int bind(int socket, const struct sockaddr *address, socklen_t address_len);
The first argument is the socket handle. The next is a pointer to struct sockaddr, a generic address type. What you actually pass is the address of one of the following more specific structures, cast to struct sockaddr*:

Mode                  struct sockaddr subtype
TCP sockets           struct sockaddr_in
UDP sockets           struct sockaddr_in
Unix domain sockets   struct sockaddr_un

The third argument is the size of the sockaddr structure used (in this case the size of sockaddr_in), so that it can be operated on safely by bind(). On error bind() returns -1 (SOCK_ERR in this code).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <netinet/in.h>
#include <unistd.h>

#define SOCK_ERR -1
#define BUFSIZE 100
#define NUMBER_OF_CLIENTS_TO_QUEUE 5
#define PORT 43210 //change if 43210 already in use

void abortt(const char * msg){
printf("\nFATAL ERROR: %s\n", msg);
exit(1);
}

int main(int argc, char* argv[]){
printf("Starting server\n");

// MAKE A SOCKET
printf("Making a socket.\n");
int listener_socket = socket(AF_INET,SOCK_STREAM,0);
if(listener_socket == SOCK_ERR) {
abortt("Could not make a socket!");
}
printf("Listener socket handle=%d\n", listener_socket);

// DECLARE AND LOAD sockaddr_in STRUCT
printf("Declaring and loading sockaddr_in struct\n");
struct sockaddr_in addr;
addr.sin_addr.s_addr=INADDR_ANY;
addr.sin_port=htons(PORT);
addr.sin_family=AF_INET;
printf("sockaddr_in struct declared and loaded.\n");

// BIND TO PORT
printf("Binding socket %d to port %d\n", listener_socket, PORT);
if(bind(listener_socket,(struct sockaddr*)&addr,sizeof(addr)) == SOCK_ERR) {
abortt("Could not bind listener socket to port");
}
printf("Bind of socket %d to port %d successful\n", listener_socket, PORT);

return(0);
}

WARNING

The following code won't work if you run it too soon after the last run. You might need to wait a minute or so to get it to run. There is a way to make it able to rerun instantly. To do so, put the following code right after the call to socket() and its associated error handling and informational messages:

// MAKE IT RERUN INSTANTLY
int flag = 1;
if(setsockopt(listener_socket, SOL_SOCKET, SO_REUSEADDR, (char*)&flag, sizeof(flag)) < 0){
abortt("Could not set socket option SO_REUSEADDR!");
}
I didn't put it into the earlier examples in order to keep everything simple, but if you find yourself doing extensive troubleshooting, the preceding code can prevent the need to wait up to a minute or two between server runs.
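
If you want to confirm that the option really took effect, you can read it back with getsockopt(). The following is a minimal sketch. The helper names enable_reuseaddr() and reuseaddr_is_on() are hypothetical.

```c
#include <sys/socket.h>

/* Hypothetical helper: turn on SO_REUSEADDR.
   Returns 0 on success, -1 on error. */
int enable_reuseaddr(int sock){
    int flag = 1;
    return setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &flag, sizeof(flag));
}

/* Read the option back. Returns 1 if on, 0 if off, -1 on error. */
int reuseaddr_is_on(int sock){
    int flag = 0;
    socklen_t len = sizeof(flag);
    if(getsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &flag, &len) == -1) return -1;
    return flag != 0;
}
```

The option must be set before the call to bind(), which is why the code above goes right after socket().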

If the bind() code and what came before it works, you'll see all info messages including the one saying the socket bound successfully, like the following:

[slitt@mydesk sockets]$ ./a.out
Starting server
Making a socket.
Listener socket handle=3
Declaring and loading sockaddr_in struct
sockaddr_in struct declared and loaded.
Binding socket 3 to port 43210
Bind of socket 3 to port 43210 successful
[slitt@mydesk sockets]$

So now you have a socket bound to a port. The next step is to convert that socket to a passive socket for listening, using the listen() function. To review, what you're listening for is clients trying to connect. The following code performs the listen() function:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <netinet/in.h>
#include <unistd.h>

#define SOCK_ERR -1
#define BUFSIZE 100
#define NUMBER_OF_CLIENTS_TO_QUEUE 5
#define PORT 43210 //change if 43210 already in use

void abortt(const char * msg){
printf("\nFATAL ERROR: %s\n", msg);
exit(1);
}

int main(int argc, char* argv[]){
printf("Starting server\n");

// MAKE A SOCKET
printf("Making a socket.\n");
int listener_socket = socket(AF_INET,SOCK_STREAM,0);
if(listener_socket == SOCK_ERR) {
abortt("Could not make a socket!");
}
printf("Listener socket handle=%d\n", listener_socket);

// DECLARE AND LOAD sockaddr_in STRUCT
printf("Declaring and loading sockaddr_in struct\n");
struct sockaddr_in addr;
addr.sin_addr.s_addr=INADDR_ANY;
addr.sin_port=htons(PORT);
addr.sin_family=AF_INET;
printf("sockaddr_in struct declared and loaded.\n");

// BIND TO PORT
printf("Binding socket %d to port %d\n", listener_socket, PORT);
if(bind(listener_socket,(struct sockaddr*)&addr,sizeof(addr)) == SOCK_ERR) {
abortt("Could not bind listener socket to port");
}
printf("Bind of socket %d to port %d successful\n", listener_socket, PORT);

// CONVERT SOCKET TO PASSIVE LISTENER
if(listen(listener_socket, NUMBER_OF_CLIENTS_TO_QUEUE) == SOCK_ERR) {
abortt("The listen() call failed");
}
printf("Listener port %d bound to port %d now listening.\n", listener_socket, PORT);

return(0);
}

If the listen() function doesn't error out, the preceding code produces the following output:

[slitt@mydesk sockets]$ ./a.out
Starting server
Making a socket.
Listener socket handle=3
Declaring and loading sockaddr_in struct
sockaddr_in struct declared and loaded.
Binding socket 3 to port 43210
Bind of socket 3 to port 43210 successful
Listener port 3 bound to port 43210 now listening.
[slitt@mydesk sockets]$

Up until this point, the program's done nothing recognizable, and certainly nothing that visibly interacts with any client. That's about to change, because now we're going to run the accept() function, which is the first blocking function we've used. To review, a blocking function stops and does not continue until it receives a response. In this case, the response would be a client trying to connect. If you're worried that the program could stop for a really long time, don't. In a real production program, there are ways, such as the select() function, for the program to test for sockets needing attention.

Anyway, the following code shows the use of accept():
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <netinet/in.h>
#include <unistd.h>

#define SOCK_ERR -1
#define BUFSIZE 100
#define NUMBER_OF_CLIENTS_TO_QUEUE 5
#define PORT 43210 //change if 43210 already in use

void abortt(const char * msg){
printf("\nFATAL ERROR: %s\n", msg);
exit(1);
}

int main(int argc, char* argv[]){
printf("Starting server\n");

// MAKE A SOCKET
printf("Making a socket.\n");
int listener_socket = socket(AF_INET,SOCK_STREAM,0);
if(listener_socket == SOCK_ERR) {
abortt("Could not make a socket!");
}
printf("Listener socket handle=%d\n", listener_socket);

// DECLARE AND LOAD sockaddr_in STRUCT
printf("Declaring and loading sockaddr_in struct\n");
struct sockaddr_in addr;
addr.sin_addr.s_addr=INADDR_ANY;
addr.sin_port=htons(PORT);
addr.sin_family=AF_INET;
printf("sockaddr_in struct declared and loaded.\n");

// BIND TO PORT
printf("Binding socket %d to port %d\n", listener_socket, PORT);
if(bind(listener_socket,(struct sockaddr*)&addr,sizeof(addr)) == SOCK_ERR) {
abortt("Could not bind listener socket to port");
}
printf("Bind of socket %d to port %d successful\n", listener_socket, PORT);

// CONVERT SOCKET TO PASSIVE LISTENER
if(listen(listener_socket, NUMBER_OF_CLIENTS_TO_QUEUE) == SOCK_ERR) {
abortt("The listen() call failed");
}
printf("Listener port %d bound to port %d now listening.\n", listener_socket, PORT);

// ACCEPT A CLIENT CONNECTION
printf("Looking to accept a client connection\n");
socklen_t sockaddrlen = sizeof(struct sockaddr_in);
int data_socket=accept(listener_socket,(struct sockaddr*)&addr,&sockaddrlen);
if(data_socket == SOCK_ERR){
abortt("Call to accept() failed");
}
printf("Accepted a client, data socket handle=%d.\n", data_socket);

return(0);
}

A couple things to notice in the preceding code. The accept() function requires not only the sockaddr structure to be passed as an address, but also the length of that structure must be passed as an address. This is so the function can change the length if necessary. The return value is the active socket to be used for data passing with the requesting client.
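
What accept() writes into the sockaddr structure is the address of the connecting client. The sketch below shows one way to print it. It assumes you pass accept() a separate struct sockaddr_in (called peer here) instead of reusing addr, which is a variation on the code above. The helper name format_peer() is hypothetical.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: write "a.b.c.d:port" for an IPv4 peer into out.
   Returns 0 on success, -1 if the text could not be produced. */
int format_peer(const struct sockaddr_in *peer, char *out, size_t outlen){
    char ip[INET_ADDRSTRLEN];
    if(inet_ntop(AF_INET, &peer->sin_addr, ip, sizeof(ip)) == NULL) return -1;
    int n = snprintf(out, outlen, "%s:%u", ip, (unsigned)ntohs(peer->sin_port));
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

In the server you would declare struct sockaddr_in peer and socklen_t peerlen = sizeof(peer), call accept(listener_socket, (struct sockaddr*)&peer, &peerlen), and then hand &peer to format_peer(). For a telnet session from the same machine, the text starts with 127.0.0.1 followed by whatever port the client happened to use.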

The preceding code produces the following output:

[slitt@mydesk sockets]$ ./a.out
Starting server
Making a socket.
Listener socket handle=3
Declaring and loading sockaddr_in struct
sockaddr_in struct declared and loaded.
Binding socket 3 to port 43210
Bind of socket 3 to port 43210 successful
Listener port 3 bound to port 43210 now listening.
Looking to accept a client connection

Notice the output does not return to the command prompt. The call to accept() has blocked, and won't proceed until a socket client tries to connect to this server program. Rather than coding a socket client right now, we'll just use telnet, which is a very generic socket client. A simplified telnet command looks like this:
telnet ip_address port_number
OK, let's do it:

[slitt@mydesk sockets]$ telnet 127.0.0.1 43210
Trying 127.0.0.1...
Connected to mydesk.domain.cxm (127.0.0.1).
Escape character is '^]'.
Connection closed by foreign host.
[slitt@mydesk sockets]$

Fascinating! The telnet program actually found a server at localhost and port 43210. If it hadn't found something, it would have said "unable to connect" instead of the "Connected" message it gave. Even more fascinating is what happened on the server, which now finishes and goes back to the command prompt. The telnet client's connection unblocked the accept() call. The output that appeared after the telnet connection is the final "Accepted a client" line, as follows:
[slitt@mydesk sockets]$ ./a.out
Starting server
Making a socket.
Listener socket handle=3
Declaring and loading sockaddr_in struct
sockaddr_in struct declared and loaded.
Binding socket 3 to port 43210
Bind of socket 3 to port 43210 successful
Listener port 3 bound to port 43210 now listening.
Looking to accept a client connection
Accepted a client, data socket handle=4.
[slitt@mydesk sockets]$

That's it. You've just proven the concept. Your socket server has responded to a socket client. As one last addition to this Hello World, let's actually have client and server exchange data.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <netinet/in.h>
#include <unistd.h>

#define SOCK_ERR -1
#define BUFSIZE 100
#define NUMBER_OF_CLIENTS_TO_QUEUE 5
#define PORT 43210 //change if 43210 already in use

void abortt(const char * msg){
printf("\nFATAL ERROR: %s\n", msg);
exit(1);
}

int main(int argc, char* argv[]){
printf("Starting server\n");

// MAKE A SOCKET
printf("Making a socket.\n");
int listener_socket = socket(AF_INET,SOCK_STREAM,0);
if(listener_socket == SOCK_ERR) {
abortt("Could not make a socket!");
}
printf("Listener socket handle=%d\n", listener_socket);

// DECLARE AND LOAD sockaddr_in STRUCT
printf("Declaring and loading sockaddr_in struct\n");
struct sockaddr_in addr;
addr.sin_addr.s_addr=INADDR_ANY;
addr.sin_port=htons(PORT);
addr.sin_family=AF_INET;
printf("sockaddr_in struct declared and loaded.\n");

// BIND TO PORT
printf("Binding socket %d to port %d\n", listener_socket, PORT);
if(bind(listener_socket,(struct sockaddr*)&addr,sizeof(addr)) == SOCK_ERR) {
abortt("Could not bind listener socket to port");
}
printf("Bind of socket %d to port %d successful\n", listener_socket, PORT);

// CONVERT SOCKET TO PASSIVE LISTENER
if(listen(listener_socket, NUMBER_OF_CLIENTS_TO_QUEUE) == SOCK_ERR) {
abortt("The listen() call failed");
}
printf("Listener port %d bound to port %d now listening.\n", listener_socket, PORT);

// ACCEPT A CLIENT CONNECTION
printf("Looking to accept a client connection\n");
socklen_t sockaddrlen = sizeof(struct sockaddr_in);
int data_socket=accept(listener_socket,(struct sockaddr*)&addr,&sockaddrlen);
if(data_socket == SOCK_ERR){
abortt("Call to accept() failed");
}
printf("Accepted a client, data socket handle=%d.\n", data_socket);

// SEND STRING TO CLIENT
printf("Sending string to client\n");
char * sendstring = "Please type something then press Enter...\n";
send(data_socket, sendstring, strlen(sendstring), 0);
printf("Sent string to client\n");

// RECEIVE FROM CLIENT
char buf[BUFSIZE];
printf("Receiving string from client\n");
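int nReceived = recv(data_socket, buf, BUFSIZE-2, 0);
// recv() returns -1 on error, which must not be used as an array index
if(nReceived == SOCK_ERR){
abortt("Call to recv() failed");
}
printf("Received %d characters from client\n", nReceived);
buf[nReceived] = '\0';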
printf("Received string=>%s<=\n", buf);

return(0);
}

The preceding code produced the following sessions:

Server session:

[slitt@mydesk sockets]$ ./a.out
Starting server
Making a socket.
Listener socket handle=3
Declaring and loading sockaddr_in struct
sockaddr_in struct declared and loaded.
Binding socket 3 to port 43210
Bind of socket 3 to port 43210 successful
Listener port 3 bound to port 43210 now listening.
Looking to accept a client connection
Accepted a client, data socket handle=4.
Sending string to client
Sent string to client
Receiving string from client
Received 7 characters from client
Received string=>steve
<=
[slitt@mydesk sockets]$

Client (telnet) session:

[slitt@mydesk sockets]$ telnet 127.0.0.1 43210
Trying 127.0.0.1...
Connected to mydesk.domain.cxm (127.0.0.1).
Escape character is '^]'.
Please type something then press Enter...
steve
Connection closed by foreign host.
[slitt@mydesk sockets]$

In the preceding, you see the server send the message "Please type something then press Enter". Then, in the telnet session, I typed "steve" and pressed Enter. That string showed up in the server's output. Two-way communication works.
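
One detail the Hello World glosses over is that send() returns the number of bytes it actually accepted, which on a stream socket can be fewer than you asked for. Here's a minimal sketch of the usual remedy, a loop that keeps sending the remainder. The helper name send_all() is hypothetical and this is not code from the server above.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: keep calling send() until all len bytes have
   been written. Returns 0 on success, -1 if send() reports an error. */
int send_all(int sock, const char *data, size_t len){
    size_t sent = 0;
    while(sent < len){
        ssize_t n = send(sock, data + sent, len - sent, 0);
        if(n == -1) return -1;
        sent += (size_t)n;
    }
    return 0;
}
```

In the server, send_all(data_socket, sendstring, strlen(sendstring)) would take the place of the plain send() call.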

As mentioned throughout this article, this was a Hello World, a proof of concept impractical for the real world. It terminated after one response from the client. It could have blocked forever if a client hadn't appeared. It used telnet instead of a socket client you coded yourself. These things will be dealt with later in this magazine, but for now take pride in the fact that you just coded a socket server, and it worked.
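
If you're curious what telnet is doing on your behalf, the client side of a connection can be sketched in a few lines. This is an illustration only, not the client developed later in this magazine. The helper name connect_to() is hypothetical, and it assumes a dotted-quad IPv4 address such as 127.0.0.1.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: connect to a server given a dotted-quad IPv4
   address and a port. Returns an active socket, or -1 on failure. */
int connect_to(const char *ip, unsigned short port){
    struct sockaddr_in server;
    memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_port = htons(port);
    if(inet_pton(AF_INET, ip, &server.sin_addr) != 1) return -1;

    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if(sock == -1) return -1;
    if(connect(sock, (struct sockaddr*)&server, sizeof(server)) == -1){
        close(sock);
        return -1;
    }
    return sock;
}
```

The client never calls bind(), listen() or accept(). The handle returned by connect_to("127.0.0.1", 43210) is already an active socket, ready for recv() and send().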
Steve Litt is the author of the Universal Troubleshooting Process Courseware. Steve can be reached at his email address.

Going Across a Network

By Steve Litt
The preceding article accessed the server as 127.0.0.1. That is the loopback address, which only works when client and server are on the same computer, so it cannot be used across a network. So the next step is, from the same computer (192.168.100.2), access it with the machine's IP address instead of 127.0.0.1:
telnet 192.168.100.2 43210
The following is the result:

[slitt@mydesk ~]$ telnet 192.168.100.2 43210
Trying 192.168.100.2...
Connected to mydesk.domain.cxm (192.168.100.2).
Escape character is '^]'.
Please type something then press Enter...
I like sockets
Connection closed by foreign host.
[slitt@mydesk ~]$

Now run the same command from 192.168.100.9 instead of 192.168.100.2:
[slitt@mylap2 ~]$ telnet 192.168.100.2 43210
Trying 192.168.100.2...

It hangs, and will eventually time out. The problem is that the firewall on the server's computer, 192.168.100.2, does not enable access to port 43210. Once you enable access, via TCP, to port 43210, inter-computer use works just like it works when client and server are on the same computer.

If it still doesn't work, check that your network connectivity is correct, that the network is up on both computers, and that all network cables are plugged in and/or wireless networking is working properly.

Last but not least, make sure inter-computer access works with domain names instead of IP addresses. If it works by IP address but not by domain name, you have a DNS problem to solve.
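
To check name resolution from C rather than from the command line, the standard getaddrinfo() function can be used. The following is a minimal sketch, limited to IPv4 to match the AF_INET sockets in this magazine. The helper name resolve_ipv4() is hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: resolve a host name (or dotted-quad string) to
   its first IPv4 address, written as text into out. Returns 0 on
   success, -1 if the name does not resolve. */
int resolve_ipv4(const char *host, char *out, size_t outlen){
    struct addrinfo hints, *results;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;        /* IPv4 only */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */

    if(getaddrinfo(host, NULL, &hints, &results) != 0) return -1;

    struct sockaddr_in *first = (struct sockaddr_in*)results->ai_addr;
    const char *ok = inet_ntop(AF_INET, &first->sin_addr, out, (socklen_t)outlen);
    freeaddrinfo(results);
    return ok ? 0 : -1;
}
```

If resolve_ipv4() fails for a host name while a connection to that machine's IP address succeeds, the problem lies in name resolution rather than in your socket code.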
Steve Litt is the author of the Universal Troubleshooting Process Courseware. Steve can be reached at his email address.

Making a Simple Echo Server

By Steve Litt
The Hello World socket server terminates after a single data flow from server to client and vice versa. That's impractical in real life. In real life, the conversation goes on, meaning the send() and recv() calls must be in a loop. This article demonstrates an echo server, in which the server returns every string sent to it. For simpler understanding, the server prepends fromserver=> to each string.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <netinet/in.h>
#include <unistd.h>

#define SOCK_ERR -1
#define BUFSIZE 100
#define NUMBER_OF_CLIENTS_TO_QUEUE 5
#define PORT 43210 //change if 43210 already in use

void abortt(const char * msg){
printf("\nFATAL ERROR: %s\n", msg);
exit(1);
}

int main(int argc, char* argv[]){
printf("Starting server\n");

// MAKE A SOCKET
printf("Making a socket.\n");
int listener_socket = socket(AF_INET,SOCK_STREAM,0);
if(listener_socket == SOCK_ERR) {
abortt("Could not make a socket!");
}
printf("Listener socket handle=%d\n", listener_socket);

// DECLARE AND LOAD sockaddr_in STRUCT
printf("Declaring and loading sockaddr_in struct\n");
struct sockaddr_in addr;
addr.sin_addr.s_addr=INADDR_ANY;
addr.sin_port=htons(PORT);
addr.sin_family=AF_INET;
printf("sockaddr_in struct declared and loaded.\n");

// BIND TO PORT
printf("Binding socket %d to port %d\n", listener_socket, PORT);
if(bind(listener_socket,(struct sockaddr*)&addr,sizeof(addr)) == SOCK_ERR) {
abortt("Could not bind listener socket to port");
}
printf("Bind of socket %d to port %d successful\n", listener_socket, PORT);

// CONVERT SOCKET TO PASSIVE LISTENER
if(listen(listener_socket, NUMBER_OF_CLIENTS_TO_QUEUE) == SOCK_ERR) {
abortt("The listen() call failed");
}
printf("Listener port %d bound to port %d now listening.\n", listener_socket, PORT);

// ACCEPT A CLIENT CONNECTION
printf("Looking to accept a client connection\n");
socklen_t sockaddrlen = sizeof(struct sockaddr_in);
int data_socket=accept(listener_socket,(struct sockaddr*)&addr,&sockaddrlen);
if(data_socket == SOCK_ERR){
abortt("Call to accept() failed");
}
printf("Accepted a client, data socket handle=%d.\n", data_socket);

// CONDUCT CONVERSATION
while(1){
char buf[BUFSIZE];
int nReceived=-1;
nReceived = recv(data_socket, buf, BUFSIZE-2, 0);
printf("nReceived=%d\n", nReceived);
buf[nReceived] = '\0';
send(data_socket, "fromserver=>", strlen("fromserver=>"), 0);
send(data_socket, buf, strlen(buf), 0);
}
return(0);
}

Run the preceding server code, telnet in, and type in strings:
[slitt@mydesk ~]$ telnet 192.168.100.2 43210
Trying 192.168.100.2...
Connected to mydesk.domain.cxm (192.168.100.2).
Escape character is '^]'.
one
fromserver=>one
two
fromserver=>two
three
fromserver=>three

It works. There's just one problem. From the telnet session, you cannot end the communication. Ctrl+C won't do it. Ctrl+D won't do it. You must either kill the server (with Ctrl+C) or kill telnet with a kill command. Interestingly enough, within the server, the diagnostic print showing the number of characters received shows two more than the length of the string typed in from Telnet. That's because Telnet sends a carriage return and a line feed (CRLF) at the end of each string.
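
If a server wants the typed text without those two extra characters, it can trim them after recv(). Here's a minimal sketch. The helper name strip_crlf() is hypothetical and the echo server above does not do this.

```c
#include <stddef.h>

/* Hypothetical helper: remove any trailing CR and LF characters from
   the first n bytes of buf, NUL-terminate the result, and return the
   new length. buf must have room for at least n+1 bytes. */
size_t strip_crlf(char *buf, size_t n){
    while(n > 0 && (buf[n-1] == '\r' || buf[n-1] == '\n')){
        n--;
    }
    buf[n] = '\0';
    return n;
}
```

For the seven characters received when "steve" is typed into Telnet, strip_crlf(buf, 7) returns 5 and leaves "steve" in buf.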

One way to quit from Telnet is to use an escape character (perhaps tilde (~)) to stop communication with the server, and then quit Telnet. Once Telnet ends, the server terminates:
[slitt@mydesk ~]$ telnet -e~ 192.168.100.2 43210
Telnet escape character is '~'.
Trying 192.168.100.2...
Connected to mydesk.domain.cxm (192.168.100.2).
Escape character is '~'.
one
fromserver=>one
two
fromserver=>two
three
fromserver=>three
four
fromserver=>four
~
telnet> quit
Connection closed.
[slitt@mydesk ~]$

Once Telnet terminates, so does the server. That's probably not what you want. Notice that the send and receive loop is programmed not to terminate (while(1)). That means that the server termination occurs from an unhandled error condition, which again is probably not what you want.
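
On Linux and other Unix systems, the silent termination is most likely the SIGPIPE signal, which the kernel raises when a program writes to a connection whose other end has gone away, and whose default action is to kill the process. The following sketch shows one common way to turn that into an ordinary error return. It is an alternative to the fix described next, not the author's code, and the helper names ignore_sigpipe() and send_unless_closed() are hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: call once at startup so that a send() on a
   closed connection returns -1 instead of killing the process. */
void ignore_sigpipe(void){
    signal(SIGPIPE, SIG_IGN);
}

/* Hypothetical helper: returns 1 if the data was sent, 0 if the peer
   has closed the connection, -1 on any other error.
   Assumes ignore_sigpipe() has already been called. */
int send_unless_closed(int sock, const char *data, size_t len){
    if(send(sock, data, len, 0) != -1) return 1;
    return (errno == EPIPE || errno == ECONNRESET) ? 0 : -1;
}
```

With this in place the server survives the client's departure, and a return value of 0 tells the loop that it's time to stop talking to that client.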

A few diagnostic prints prove that the program aborts when trying to perform a send() on a socket closed by Telnet termination. It's easy enough to not perform the send() and then to terminate the loop. You simply use the number of received characters as the loop variable, and include a continue statement to prevent execution of the send() after socket termination.

Then, to make sure you can reconnect from this telnet session or another one, put the accept() and the entire send/recv loop into an outer forever loop:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h> // socket(), bind(), listen(), accept(), send(), recv()
#include <netinet/in.h>
#include <unistd.h>

#define SOCK_ERR -1
#define BUFSIZE 100
#define NUMBER_OF_CLIENTS_TO_QUEUE 5
#define PORT 43210 //change if 43210 already in use

void abortt(const char * msg){
printf("\nFATAL ERROR: %s\n", msg);
exit(1);
}

int main(int argc, char* argv[]){
printf("Starting server\n");

// MAKE A SOCKET
printf("Making a socket.\n");
int listener_socket = socket(AF_INET,SOCK_STREAM,0);
if(listener_socket == SOCK_ERR) {
abortt("Could not make a socket!");
}
printf("Listener socket handle=%d\n", listener_socket);

// DECLARE AND LOAD sockaddr_in STRUCT
printf("Declaring and loading sockaddr_in struct\n");
struct sockaddr_in addr;
addr.sin_addr.s_addr=INADDR_ANY;
addr.sin_port=htons(PORT);
addr.sin_family=AF_INET;
printf("sockaddr_in struct declared and loaded.\n");

// BIND TO PORT
printf("Binding socket %d to port %d\n", listener_socket, PORT);
if(bind(listener_socket,(struct sockaddr*)&addr,sizeof(addr)) == SOCK_ERR) {
abortt("Could not bind listener socket to port");
}
printf("Bind of socket %d to port %d successful\n", listener_socket, PORT);

// CONVERT SOCKET TO PASSIVE LISTENER
if(listen(listener_socket, NUMBER_OF_CLIENTS_TO_QUEUE) == SOCK_ERR) {
abortt("The listen() call failed");
}
printf("Listener port %d bound to port %d now listening.\n", listener_socket, PORT);

while(1){
// ACCEPT A CLIENT CONNECTION
printf("Looking to accept a client connection\n");
int sockaddrlen = sizeof(struct sockaddr_in);
int data_socket=accept(listener_socket,(struct sockaddr*)&addr,(socklen_t *)&sockaddrlen);
if(data_socket == SOCK_ERR){
abortt("Call to accept() failed");
}
printf("Accepted a client, data socket handle=%d.\n", data_socket);

// CONDUCT CONVERSATION
int nReceived=-1;
while(nReceived){
char buf[BUFSIZE];
nReceived = recv(data_socket, buf, BUFSIZE-2, 0);
printf("nReceived=%d\n", nReceived);
if(nReceived < 0) nReceived = 0; // treat a recv() error like a disconnect
if(nReceived==0) continue;
buf[nReceived] = '\0';
send(data_socket, "fromserver=>", strlen("fromserver=>"), 0);
send(data_socket, buf, strlen(buf), 0);
}
}
return(0);
}

The preceding code supports the following Telnet session. Notice that you can back out of Telnet and go back in again, and the server is running:
[slitt@mydesk ~]$ telnet -e~ 192.168.100.2 43210
Telnet escape character is '~'.
Trying 192.168.100.2...
Connected to mydesk.domain.cxm (192.168.100.2).
Escape character is '~'.
1
fromserver=>1
2
fromserver=>2
3
fromserver=>3
~
telnet> quit
Connection closed.
[slitt@mydesk ~]$ telnet -e~ 192.168.100.2 43210
Telnet escape character is '~'.
Trying 192.168.100.2...
Connected to mydesk.domain.cxm (192.168.100.2).
Escape character is '~'.
4
fromserver=>4
5
fromserver=>5
6
fromserver=>6
~
telnet> quit
Connection closed.
[slitt@mydesk ~]$

One might complain that you can't shut down the server from the telnet client, but in fact that's how things should work. A server should be shut down on the server's computer only.

However, the preceding supports only one client at a time. Can you imagine if a webserver could serve only one visitor at a time? Read on...
Steve Litt is the author of the Troubleshooting Techniques of the Successful Technologist. Steve can be reached at his email address.

Multiple Clients Through Forking

By Steve Litt
Simultaneously supporting multiple clients presents the following challenges:
  • Each client needs its own data socket.
  • The server must not wait on idle clients.
The data socket per client requirement is handled by an array of sockets, or some other data structure to support several data sockets. You can use an array and a maximum, or an array that can be reallocated, or a binary tree of handles, or whatever.

The requirement not to wait on idle clients means preventing your program from getting stopped by blocking operations. There are three common ways to do this:
  1. Use nonblocking I/O.
  2. Fork a process for each data socket.
  3. Use the select() statement to service only ready sockets.
#2 and #3 are easier. This article discusses #2.

The forking option is conceptually simple. Every time an accept() yields a new data socket (meaning a new client), the process is forked. The child handles the data thread, and the parent performs another accept() to continue listening for new client connections. No need to track connections. It's very straightforward. The one nonobvious element is zombie prevention, requiring the following to be put in the program to prevent children from becoming zombies:
signal(SIGCHLD, SIG_IGN);
There are better, more reliable and more versatile methods of preventing zombies, but the preceding is good enough for a demonstration program.

So we'll build a multiclient echo server using fork() to create a new process for each client and its associated data socket. In that way, each communication is between one pair of programs, so other clients can be ignored as if they didn't exist. This is conceptually simple.

To do this, the program is essentially the same through the call to listen(). Then the signal(SIGCHLD,SIG_IGN); command is given to prevent zombies, and then an infinite loop calling accept() is run. In each loop iteration, after accept() returns a data socket, fork() is called to create a child process. The child calls subroutine trade_data() to carry out communication between the server and the particular client, while the parent iterates back to accept(), looking for another client.

The program listing follows:
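#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h> // signal(), SIGCHLD, SIG_IGN
#include <sys/socket.h> // socket(), setsockopt(), accept(), send(), recv()
#include <netinet/in.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>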

#define SOCK_ERR -1
#define BUFSIZE 100
#define NUMBER_OF_CLIENTS_TO_QUEUE 5
#define PORT 43210 //change if 43210 already in use

void abortt(const char * msg){
printf("\nFATAL ERROR: %s\n", msg);
exit(1);
}

void trade_data(int data_socket){
char msgbuffer[400];

// CONDUCT CONVERSATION
int nReceived=-1;
while(nReceived){
char buf[BUFSIZE];
nReceived = recv(data_socket, buf, BUFSIZE-2, 0);
printf("nReceived=%d\n", nReceived);
if(nReceived < 0) nReceived = 0; // treat a recv() error like a disconnect
if(nReceived==0) continue;
buf[nReceived] = '\0';
sprintf(msgbuffer, "fromserver#%d=>", data_socket);
send(data_socket, msgbuffer, strlen(msgbuffer), 0);
send(data_socket, buf, strlen(buf), 0);
}
}

int main(int argc, char* argv[]){
pid_t pid;
int data_socket;
printf("Starting server\n");

// MAKE A SOCKET
printf("Making a socket.\n");
int listener_socket = socket(AF_INET,SOCK_STREAM,0);
if(listener_socket == SOCK_ERR) {
abortt("Could not make a socket!");
}
printf("Listener socket handle=%d\n", listener_socket);

// MAKE IT RERUN INSTANTLY
int flag = 1;
if(setsockopt(listener_socket, SOL_SOCKET, SO_REUSEADDR, (char*)&flag, sizeof(flag)) < 0){
abortt("Could not set socket option SO_REUSEADDR!");
}

// DECLARE AND LOAD sockaddr_in STRUCT
printf("Declaring and loading sockaddr_in struct\n");
struct sockaddr_in addr;
addr.sin_addr.s_addr=INADDR_ANY;
addr.sin_port=htons(PORT);
addr.sin_family=AF_INET;
printf("sockaddr_in struct declared and loaded.\n");

// BIND TO PORT
printf("Binding socket %d to port %d\n", listener_socket, PORT);
if(bind(listener_socket,(struct sockaddr*)&addr,sizeof(addr)) == SOCK_ERR) {
abortt("Could not bind listener socket to port");
}
printf("Bind of socket %d to port %d successful\n", listener_socket, PORT);

// CONVERT SOCKET TO PASSIVE LISTENER
if(listen(listener_socket, NUMBER_OF_CLIENTS_TO_QUEUE) == SOCK_ERR) {
abortt("The listen() call failed");
}
printf("Listener port %d bound to port %d now listening.\n", listener_socket, PORT);

// PREVENT CHILD PROCESSES FROM BECOMING ZOMBIES
signal(SIGCHLD, SIG_IGN);

while(1){
printf("Looking to accept a client connection\n");
int sockaddrlen = sizeof(struct sockaddr_in);
data_socket=accept(listener_socket,(struct sockaddr*)&addr,(socklen_t *)&sockaddrlen);
if(data_socket == SOCK_ERR){
abortt("Call to accept() failed");
}
printf("Accepted a client, data socket handle=%d.\n", data_socket);
pid = fork();
if(pid == 0){ //child process
trade_data(data_socket);
exit(0);
}
else if(pid < 0){ //fork error
abortt("Failed to fork!");
}
else { //parent
printf("Successfully forked process %d to handle data socket %d\n", pid, data_socket);
}
}
return(0);
}

The following three sessions show the results for three different clients:
[slitt@mydesk ~]$ telnet -e~ 192.168.100.2 43210
Telnet escape character is '~'.
Trying 192.168.100.2...
Connected to mydesk.domain.cxm (192.168.100.2).
Escape character is '~'.
1
fromserver#4=>1
2
fromserver#4=>2
3
fromserver#4=>3
~
telnet> quit
Connection closed.
[slitt@mydesk ~]$
[slitt@mydesk ~]$ telnet -e~ 192.168.100.2 43210
Telnet escape character is '~'.
Trying 192.168.100.2...
Connected to mydesk.domain.cxm (192.168.100.2).
Escape character is '~'.
a
fromserver#5=>a
b
fromserver#5=>b
c
fromserver#5=>c
~
telnet> quit
Connection closed.
[slitt@mydesk ~]$
[slitt@mydesk ~]$ telnet -e~ 192.168.100.2 43210
Telnet escape character is '~'.
Trying 192.168.100.2...
Connected to mydesk.domain.cxm (192.168.100.2).
Escape character is '~'.
I
fromserver#6=>I
II
fromserver#6=>II
III
fromserver#6=>III
~
telnet> quit
Connection closed.
[slitt@mydesk ~]$

The preceding sessions show each client first connecting and then sending one string. Then the clients alternate sending a second string, and then a third, and then alternate disconnecting. It should be noted that when all three disconnect, the server is still running, and yet another client can connect.

If your situation allows multiple processes for multiple clients, and if you take a few precautions to prevent orphans if the parent terminates, and if there aren't thread-safety considerations, this is a good, easy and clean way for a socket server to handle multiple clients.

Multiple Clients Via Select()

By Steve Litt
What if you don't want a separate process for every client? Another possibility is to use the select() function to service only ready sockets in order to manage multiple clients. This will also require some sort of data structure -- an array, a tree, a linked list -- something to manage the multitude of data sockets. It's not as easy as forking off processes, but this article explains how to do it.

Simultaneously supporting multiple clients via select() presents two challenges: keeping a data structure of the data sockets in use, and using select() itself.
What's confusing about the data structure is that the GNU library gives you almost enough to do the job with the fd_set type. However, each socket of interest must be set in the fd_set type variable before the call to select(), whereas each socket not of interest must be unset. Therefore, we must keep explicit track of the sockets of interest.

To do that we'll use a simple array, because this is a training exercise. In a real app we'd probably have some sort of dynamic structure that could accommodate any number of data sockets, but we'll go simple here. To make things even simpler (though totally unacceptable in a production environment), we'll simply abort the program if the maximum number of sockets is exceeded. However, the good news is that as clients disconnect, their representations in the array are set to 0 so they can be reused by the next client that logs on. The array in the server looks like this:
int dataSocketArray[MAXDATASOCKETS];
New clients are added to the array like this:
int firstZeroDataSocket;
for(firstZeroDataSocket=0; firstZeroDataSocket < MAXDATASOCKETS; firstZeroDataSocket++)
if(dataSocketArray[firstZeroDataSocket] == 0)
break;
if(firstZeroDataSocket == MAXDATASOCKETS)
abortt("Maximum data sockets reached!");
dataSocketArray[firstZeroDataSocket] = data_socket;
When a socket logs off, its place in the array is zeroed like this:
if(!respond_to_data(i))
for(j=0; j < MAXDATASOCKETS; j++)
if(dataSocketArray[j] == i)
dataSocketArray[j] = 0;
All this sequential lookup is very inefficient, but given the usual low number of file and socket handles, it's lightning fast. On loaded systems with large numbers of handles, you could make an array of 65535 elements and have the subscript represent the handle number, in which case lookup would be almost instantaneous.
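The direct-indexed idea can be sketched as follows. The names MAXFD, fdInUse, track_fd(), untrack_fd() and is_tracked() are made up for illustration, and MAXFD is an assumed upper bound on descriptor values rather than a system constant:

```c
#define MAXFD 65535 /* assumed upper bound on descriptor values */

static unsigned char fdInUse[MAXFD + 1]; /* subscript is the descriptor itself */

/* Mark fd as a data socket of interest. Returns 0, or -1 if out of range. */
int track_fd(int fd){
    if(fd < 0 || fd > MAXFD) return(-1);
    fdInUse[fd] = 1;
    return(0);
}

/* Forget fd, for example after its client disconnects. */
void untrack_fd(int fd){
    if(fd >= 0 && fd <= MAXFD) fdInUse[fd] = 0;
}

/* Return 1 if fd is currently tracked, otherwise 0. */
int is_tracked(int fd){
    return(fd >= 0 && fd <= MAXFD && fdInUse[fd]);
}
```

Adding, removing and testing a handle each take a single array access, with no search loop. Keep in mind that select() itself can only watch descriptors below FD_SETSIZE, so a table used with select() never needs more entries than that.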

select() and its Friends

As mentioned, the select() function tells you which sockets are ready to yield information and therefore won't block. The way it's done is not obvious. The definition of select() looks like this:
int select(int nfds, fd_set *restrict readfds, fd_set *restrict writefds, fd_set *restrict errorfds, struct timeval *restrict timeout);
The integer it returns is the number of sockets ready to yield information, so if it's zero (a timeout) there's no socket reading to be done. A return of -1 indicates an error. The arguments are as follows:

int nfds
This should be set to the highest fd to check plus one. So if the highest data socket descriptor is 14, this should be set to 15. You can also think of it this way: If the highest fd is 14, you need to check from 0 through 14, so the number of fds (nfds) is 15.
fd_set *restrict readfds
The set of socket fds to be checked for pending reads. If it's NULL, no check is made.
fd_set *restrict writefds
The set of socket fds to be checked for readiness to write. If it's NULL, as it is in this server, no check is made.
fd_set *restrict errorfds
The set of socket fds to be checked for pending error conditions. If it's NULL, as it is in this server, no check is made.
struct timeval *restrict timeout
The select() function returns immediately if there are sockets needing service. If there are no sockets needing service upon entry to select(), select() waits the timeout period to see if a socket will require service. The instant any socket requires service, select() returns. The struct timeval structure looks like this:
struct timeval{
time_t tv_sec; // Seconds.
suseconds_t tv_usec; // Microseconds.
};
It's set to 5 seconds like this:
struct timeval timeout = {5,0}; 

In the echo server, the fd_set arguments for write and error are set to NULL, leaving only the one for read. fd_set variables are handled with four macros:
Macro What it does
FD_ZERO(fd_set * set)
This initializes an fd_set variable to the empty set. Do this before calling select().
FD_SET(int fd, fd_set * set)
This sets the slot for the integer fd to 1 within the fd_set variable. Because select() works by setting any fd of a working socket back to 0 if it is not pending, before calling select() all fds of interest must be set to 1 with FD_SET. However, if you set an fd that is not a valid socket or file, select() will fail and return -1. Test for that and report it with perror(), because it's a hard-to-find error.
FD_ISSET(int fd, fd_set * set)
This tests whether the given fd is set. Because select() sets any blocked fds to 0, fd must have been set to 1 with FD_SET() before each invocation of select(). If FD_ISSET() returns nonzero for a given fd after select() returns, that fd can be read without blocking.
FD_CLR(int fd, fd_set * set)
This removes the fd from the set under consideration. Things seem to work OK without it, but if a data socket disconnects, it's probably a good idea to run FD_CLR() on that data socket.

The bottom line is this: Armed with select(), FD_ZERO(), FD_SET(), FD_ISSET() and FD_CLR(), you can know which sockets are ready for reading, whether they're listener sockets or data sockets.

The following is a select() assisted implementation of a multiclient echo server:

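#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h> // signal(), SIGCHLD, SIG_IGN
#include <sys/select.h> // select(), fd_set and the FD_ macros
#include <sys/socket.h> // socket(), setsockopt(), accept(), send(), recv()
#include <netinet/in.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <assert.h>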

#define SOCK_ERR -1
#define BUFSIZE 100
#define NUMBER_OF_CLIENTS_TO_QUEUE 5
#define PORT 43210 //change if 43210 already in use

#define MAXDATASOCKETS 20

int dataSocketArray[MAXDATASOCKETS];

void abortt(const char * msg){
printf("\nFATAL ERROR: %s\n", msg);
exit(1);
}

int respond_to_data(int data_socket){
char msgbuffer[400];

// CONDUCT CONVERSATION
int nReceived=-1;
char buf[BUFSIZE];
printf("About to receive on handle %d\n", data_socket);
nReceived = recv(data_socket, buf, BUFSIZE-2, 0);
printf("nReceived=%d\n", nReceived);
if(nReceived > 0){
buf[nReceived] = '\0';
sprintf(msgbuffer, "fromserver#%d=>", data_socket);
send(data_socket, msgbuffer, strlen(msgbuffer), 0);
send(data_socket, buf, strlen(buf), 0);
printf("Data socket %d said: %s\n", data_socket, buf);
}
else if(nReceived == 0){
printf("Data socket %d disconnected.\n", data_socket);
}
else {
printf("Recv() error on socket %d\n", data_socket);
nReceived = 0; // report an error as a disconnect so the caller drops the socket
}
return(nReceived);
}

int calc_maxfd(int *arr, int listener_fd){
int i;
int maxfd=listener_fd;
for(i=0; i < MAXDATASOCKETS; i++)
if(arr[i] > maxfd)
maxfd = arr[i];
return(maxfd);
}


int main(int argc, char* argv[]){
int data_socket;
printf("Starting server\n");

// MAKE A SOCKET
printf("Making a socket.\n");
int listener_socket = socket(AF_INET,SOCK_STREAM,0);
if(listener_socket == SOCK_ERR) {
abortt("Could not make a socket!");
}
printf("Listener socket handle=%d\n", listener_socket);

// MAKE IT RERUN INSTANTLY
int flag = 1;
if(setsockopt(listener_socket, SOL_SOCKET, SO_REUSEADDR, (char*)&flag, sizeof(flag)) < 0){
abortt("Could not set socket option SO_REUSEADDR!");
}

// DECLARE AND LOAD sockaddr_in STRUCT
printf("Declaring and loading sockaddr_in struct\n");
struct sockaddr_in addr;
addr.sin_addr.s_addr=INADDR_ANY;
addr.sin_port=htons(PORT);
addr.sin_family=AF_INET;
printf("sockaddr_in struct declared and loaded.\n");

// BIND TO PORT
printf("Binding socket %d to port %d\n", listener_socket, PORT);
if(bind(listener_socket,(struct sockaddr*)&addr,sizeof(addr)) == SOCK_ERR) {
abortt("Could not bind listener socket to port");
}
printf("Bind of socket %d to port %d successful\n", listener_socket, PORT);

// CONVERT SOCKET TO PASSIVE LISTENER
if(listen(listener_socket, NUMBER_OF_CLIENTS_TO_QUEUE) == SOCK_ERR) {
abortt("The listen() call failed");
}
printf("Listener port %d bound to port %d now listening.\n", listener_socket, PORT);

// PREVENT CHILD PROCESSES FROM BECOMING ZOMBIES
signal(SIGCHLD, SIG_IGN);

// INITIALIZE FOR THE SELECT LOOP
int maxfd = listener_socket;
fd_set set;
int i;
for(i=0; i < MAXDATASOCKETS; i++){
dataSocketArray[i] = 0;
}

// REPEATEDLY RUN SELECT
while(1){
struct timeval timeout = {5,0}; // 5 second timeout
FD_ZERO(&set);
for(i=0; i < MAXDATASOCKETS; i++)
if(dataSocketArray[i] != 0)
FD_SET(dataSocketArray[i], &set);
FD_SET(listener_socket, &set);

// DIAGNOSTIC PRINT FOLLOWS, COMMENT IN FOR DEBUGGING
// printf("dia about to run select(), maxfd=%d, listener_socket=%d\n", maxfd, listener_socket);

// CHECK READY SOCKETS
int numReady = select(maxfd+1, &set, NULL, NULL, &timeout);
if(numReady < 0) perror("select returns negative");
printf("dia after select, numReady=%d, maxfd=%d, set=:\n", numReady, maxfd);
for(i=0; i <= maxfd; i++){
printf("Socket %d is %d\n", i, FD_ISSET(i, &set));
}
if (numReady < 0) abortt("Error on select()");
if (numReady > 0){
// FIND NEXT 0 IN dataSocketArray
int firstZeroDataSocket;
for(firstZeroDataSocket=0; firstZeroDataSocket < MAXDATASOCKETS; firstZeroDataSocket++)
if(dataSocketArray[firstZeroDataSocket] == 0)
break;
if(firstZeroDataSocket == MAXDATASOCKETS)
abortt("Maximum data sockets reached!"); // Change this for production code

// CHECK LISTENER SOCKET
if (FD_ISSET(listener_socket, &set)){
// DERIVE NEW DATA SOCKET
int sockaddrlen = sizeof(struct sockaddr_in);
data_socket=accept(listener_socket,(struct sockaddr*)&addr,(socklen_t *)&sockaddrlen);
if(data_socket == SOCK_ERR){
abortt("Call to accept() failed");
}
printf("New client. Data socket handle=%d.\n", data_socket);
if(data_socket > maxfd) maxfd = data_socket;
FD_SET(data_socket, &set);
dataSocketArray[firstZeroDataSocket] = data_socket;
} // end of checking listener socket
else {
// SERVE ALL READY DATA SOCKETS
int i, j;
for(i=0; i <= maxfd; i++){
if(i != listener_socket){
if(FD_ISSET(i, &set)){
if(!respond_to_data(i)){
FD_CLR(i, &set);
for(j=0; j < MAXDATASOCKETS; j++)
if(dataSocketArray[j] == i)
dataSocketArray[j] = 0;
if(i == maxfd) maxfd = calc_maxfd(dataSocketArray, listener_socket);
}
}
}
} // end of serving all ready data sockets

printf("Array=");
for(i=0; i < MAXDATASOCKETS; i++) printf("|%d|", dataSocketArray[i]);
printf("\n");
}
} // end of sockets ready test

} // end of select() loop

return(0);
}


Building a TCP Client

By Steve Litt
Did you notice that Telnet always sent a CR/LF (carriage return and line feed) at the end of every string you typed in? Might there be situations where you don't want it to do that? Might there be situations where you want prompts to assist the user in typing in material? Might there be situations where you want the client to be menu driven, or graphical? Telnet's a great all around socket client for testing, but sooner or later you need to build yourself a real client.

Here's the series of steps a socket client must perform:
  1. Make the socket data structure (socket())
  2. Declare and fill in the struct sockaddr_in structure with the server's information
  3. Connect to the server (connect())
  4. Send and receive data to and from the server (send() and recv())
Notice that unlike a server, it doesn't need to bind to a port number. The client will find a port to use. If you want to guarantee that a certain port gets used, and/or if you want to guarantee that only one copy gets run, you can bind it to a port number. It also doesn't need to accept -- it proactively tries to connect when you use the connect() function, and the server accepts.

The following code is an ultra-simple client for an echo server -- either fork()ed or select()ed:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>

#define SOCK_ERR -1
#define SERVER_PORT 43210 //change if 43210 already in use
#define SERVER_IP_ADDRESS "192.168.100.2" //change to address of server

#define MAXDATASOCKETS 20


void abortt(const char * msg){
printf("\nFATAL ERROR: %s\n", msg);
exit(1);
}


int main(int argc, char* argv[]){
char buf[1000];
printf("Starting client\n");

// MAKE A SOCKET
printf("Making a socket.\n");
int data_socket = socket(AF_INET,SOCK_STREAM,0);
if(data_socket == SOCK_ERR) {
abortt("Could not make a socket!");
}
printf("Listener socket handle=%d\n", data_socket);

// DECLARE AND LOAD sockaddr_in STRUCT
printf("Declaring and loading sockaddr_in struct\n");
struct sockaddr_in addr;
addr.sin_addr.s_addr=inet_addr(SERVER_IP_ADDRESS);
addr.sin_port=htons(SERVER_PORT);
addr.sin_family=AF_INET;
printf("sockaddr_in struct declared and loaded.\n");

// CONNECT TO SERVER
int result = connect(data_socket, (struct sockaddr *) &addr, sizeof(struct sockaddr_in));
if(result == SOCK_ERR)
abortt("ERROR: Failure in connect()");

// DIALOG LOOP
while(1){
printf(">");

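// ACQUIRE STRING FROM USER AND TWEAK
if(fgets(buf, sizeof(buf), stdin) == NULL) break; // EOF or read error
size_t len = strlen(buf);
if(len > 0 && buf[len-1] == '\n') buf[--len] = '\0'; // strip the newline fgets() keeps
// ONLY LOOK FOR ~~~ IF THE STRING IS AT LEAST 3 CHARACTERS LONG
if(len >= 3 && !strcmp(buf + len - 3, "~~~")){
char *pch = buf + len - 3;
*pch='\n';
*(pch+1)='\0';
}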

if(strlen(buf) > 0){ // 0 bytes is no send at all!:
send(data_socket, buf, strlen(buf), 0);

int nReceived = recv(data_socket, buf, sizeof(buf)-2, 0);
// printf("dia nReceived=%d\n", nReceived);
if(nReceived > 0){
buf[nReceived] = '\0';
printf("%s\n", buf);
}
else if(nReceived == 0){
printf("Data socket %d disconnected.\n", data_socket);
nReceived = recv(data_socket, buf, sizeof(buf)-2, 0);
}
else {
printf("Recv() error on socket %d\n", data_socket);
}
}

}

return(0);
}

The preceding code allows the user to type in a string, and when the user presses Enter it sends the string, but without a newline. If you want a newline, put three tildes on the end of the string and those will be replaced by a newline. Notice also that the way to stop this client is by pressing Ctrl+C. Crude but effective.

The preceding code creates the following session when run against the select() enabled multiclient server:

[slitt@mydesk sockets]$ ./a.out
Starting client
Making a socket.
Listener socket handle=3
Declaring and loading sockaddr_in struct
sockaddr_in struct declared and loaded.
>one
fromserver#10=>one
>two
fromserver#10=>two
>three
fromserver#10=>three
>four with newline~~~
fromserver#10=>four with newline

>five with just 2 tildes~~
fromserver#10=>five with just 2 tildes~~
>
[slitt@mydesk sockets]$


These Are Educational Exercises

By Steve Litt
Looking at the several servers and the client discussed in this magazine, you'll quickly realize these would fail miserably in a real production environment.

All the echo servers are timing dependent. They define "a string" as what comes in between recv() or select() calls. This works as expected with a human typist, but fails miserably when pasted in, as follows:

[slitt@mydesk sockets]$ ./a.out
Starting client
Making a socket.
Listener socket handle=3
Declaring and loading sockaddr_in struct
sockaddr_in struct declared and loaded.
>one
two
three
four
five
six
seven
eight
nine
ten~~~
~~~
fromserver#12=>one
>fromserver#12=>
>two
>fromserver#12=>three
>fromserver#12=>four
>fromserver#12=>five
>fromserver#12=>
>sixseven
>fromserver#12=>eight
>fromserver#12=>nine
>fromserver#12=>ten

>
[slitt@mydesk sockets]$
As you can see, text numbers one through ten were pasted in. ten was terminated with three tildes, translating into a newline. There was also a line with three tildes after the ten. The entire paste showed up before the first return from the server. Worse, two, six and seven showed some timing anomalies.

The results would have been much worse if the server had tens of clients constantly sending information. In that case all clients would get back extremely unexpected forms of echoes.

Assuming the desire was to echo a line only after a newline was received, the real way to do it would have been to scan the buffer for newlines after each recv(), and if one were found, place a null byte in its place, shoot the buffer back with send(), and then slide the remainder of the data to the beginning of the data.
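The scan-and-slide step just described can be sketched like this. It is an illustration under stated assumptions, not code from any of the servers above: the name extract_line() is made up, and the caller is assumed to keep one accumulation buffer per client and append each recv() result to it:

```c
#include <string.h>

/* Hypothetical helper. buf holds *used bytes accumulated from recv() calls.
   If buf contains a newline, copy the line (without the newline) into line
   as a null-terminated string, slide the remaining bytes to the front of
   buf, update *used, and return 1. Otherwise return 0 and change nothing.
   Lines longer than linesize-1 bytes are truncated. */
int extract_line(char *buf, size_t *used, char *line, size_t linesize){
    if(linesize == 0) return(0);
    char *nl = memchr(buf, '\n', *used);
    if(nl == NULL) return(0);                  /* no complete line yet */
    size_t linelen = (size_t)(nl - buf);
    size_t copylen = linelen < linesize - 1 ? linelen : linesize - 1;
    memcpy(line, buf, copylen);
    line[copylen] = '\0';
    size_t consumed = linelen + 1;             /* the line plus its newline */
    memmove(buf, buf + consumed, *used - consumed); /* slide the remainder down */
    *used -= consumed;
    return(1);
}
```

A server would call this in a loop after every recv(), sending back each complete line and leaving any partial line in the buffer for next time. With Telnet as the client, each extracted line would still end in a carriage return, which the caller may want to strip.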

With TCP sockets, it's important to understand that the data is a stream of bytes, nothing more and nothing less. It doesn't get sent as "strings", and if you want it recognized as units like "strings" or "records", delimiter bytes must be used.

However, such delimiters on the client end and parsing on the server end would have obfuscated the important socket concepts and thus would have been overkill for the purpose of an exercise.

A word about buffers. In the fork() version, each child process gets its own copy of the address space, so clients never share a buffer. In the select() version, buf is local to respond_to_data() and is filled and echoed within a single call, so nothing carries over between clients. This is OK as an exercise, when one human typist must operate all clients and can't simultaneously operate multiple clients. However, once a production select() server has to accumulate partial input between calls, each client needs its own buffer to prevent interclient memory clashes.

All the servers in this magazine hard coded the port number. That makes them extremely unportable (no pun intended). Installed on a machine that's already using port 43210, they would fail. The usual fix is to make the port configurable instead of compiling it in. Once again, configurable ports would have been overkill in this educational code.
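For the curious, one common approach is to take the port from the command line and fall back to the compiled-in default. The following is a sketch only, not something taken from the listings in this magazine, and the function name parse_port() is made up:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert a string such as argv[1] to a port number.
   Returns the port (1-65535), or fallback if s is NULL, empty,
   not a number, or out of range. */
int parse_port(const char *s, int fallback){
    if(s == NULL || *s == '\0') return(fallback);
    char *end;
    errno = 0;
    long val = strtol(s, &end, 10);
    if(errno != 0 || *end != '\0' || val < 1 || val > 65535)
        return(fallback);
    return((int)val);
}
```

A server's main() could then begin with something like int port = parse_port(argc > 1 ? argv[1] : NULL, PORT); and pass port to htons() in place of the PORT constant.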

On the forked version, care must be taken to be thread safe and either to make sure the parent stays alive until all its children are gone, or to provide some way to shut it all down safely. This becomes much more important when serving tens of clients with lots of traffic.

The select() version isn't immune to real world problems. First of all, its algorithm is complex, with lots of nooks and crannies for bugs to hide. Second, when a program must wait for either a signal or a ready file descriptor, race conditions can develop, according to the select() documentation. In those cases, for the utmost reliability, you must use pselect().

None of the examples closed the socket. Closing the socket isn't necessary in a simple experiment with a very light load, so for simplicity I left it out. However, it's obvious that good programming habits, and probably security and reliability, require closing the socket with close(), optionally after a shutdown().
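A minimal sketch of closing a connected socket follows. The helper name close_socket() is made up, and whether to call shutdown() first is a design choice rather than a requirement:

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: finish with a connected socket.
   shutdown() tells the peer no more data is coming in either direction;
   close() releases the descriptor. Returns the result of close(). */
int close_socket(int sock){
    shutdown(sock, SHUT_RDWR); /* may fail if already disconnected; ignored here */
    return(close(sock));
}
```

In the forked server, for example, the parent could call close(data_socket) right after fork(), because the child holds its own copy of the descriptor.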

Then there may be security issues, which I can't even guess. Ask a security expert.

The bottom line is this: The material in this magazine hasn't made you a TCP socket guru, but it's given you the tools to quickly build a socket application set, after which you can refine out the gotchas.

Building a TCP Proxy Server

By Steve Litt
The ports below 1024 are special. Customarily, they're meant for use by the operating system itself, and customarily only user root can bind to them. You certainly don't want all your home grown clients and servers using these low-numbered ports. You also don't want a high-numbered port attached to code that could be dangerous.

Enter the TCP proxy. It could be a server on port 43210, and a client to your POP server at port 110. It receives user input at 43210, checks and sanitizes that input, and sends safe commands, based on that input, to 110. It receives information back from 110, processes it for the user's convenience and the safety of the system, and then passes it back to the user on 43210.
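The "checks and sanitizes" step could be as simple as a whitelist of commands the proxy is willing to forward. The following sketch is purely illustrative: the function name is_allowed_command() is made up, and the particular list of POP3 commands is an assumed policy, not a recommendation:

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Illustrative policy only: POP3 commands this hypothetical proxy forwards. */
static const char *allowedCommands[] = {"USER", "PASS", "STAT", "LIST", "RETR", "QUIT"};

/* Return 1 if line starts with an allowed command followed by a space,
   CR, LF, or end of string; otherwise return 0. The match ignores case. */
int is_allowed_command(const char *line){
    size_t i;
    for(i = 0; i < sizeof(allowedCommands)/sizeof(allowedCommands[0]); i++){
        size_t len = strlen(allowedCommands[i]);
        if(strncasecmp(line, allowedCommands[i], len) == 0){
            char next = line[len];
            if(next == '\0' || next == ' ' || next == '\r' || next == '\n')
                return(1);
        }
    }
    return(0);
}
```

With this policy, a client asking to delete mail would simply never reach the POP server, because DELE is not on the list.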

Building a Unix Domain Socket Server

By Steve Litt
Unix named pipes (FIFOs) are the ultimate thin interface, bringing modularity and encapsulation to a new level. Trouble is, they go just one way. Oh, you can make two of them, but there's another possibility: Unix Domain Sockets. From a high level view, a Unix Domain Socket is like a bi-directional FIFO.

Unix domain sockets resemble TCP sockets almost uncannily. If you understand TCP sockets, you understand Unix domain sockets. You can transform one to the other with a minimum amount of change. Here are some of the differences:

                        TCP Socket                      Unix Domain Socket
Binds to                An IP address/port combo        A filename
sockaddr subtype        struct sockaddr_in              struct sockaddr_un
Length arg in connect   sizeof(struct sockaddr_in)      strlen(local.sun_path) + sizeof(local.sun_family) **
Access to server        sockaddr_in.sin_addr.s_addr,    Write permissions on bound filename +++
                        sockaddr_in.sin_port,
                        firewall rules
Reach                   Anywhere on network             Only on localhost

** sockaddr_un looks like this:
struct sockaddr_un {
short int sun_family;
char sun_path[108];
}
After copying the filename to sun_path, the portion of sun_path not containing garbage is its strlen(). Add to that the length of sun_family. Notice also the magic number 108. This could bite you if you construct filenames on the fly in various directories.

+++ A client must be able to write to the bound filename in order to connect() to the server.
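To keep the magic number 108 from biting, the filename's length can be checked against sizeof(sun_path) before copying. The sketch below is one way to do it, with a made-up function name, load_unix_addr(). It computes the length with offsetof() rather than sizeof(local.sun_family); on Linux the two give the same number:

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fill in addr for the given filename and compute the
   length argument for bind() or connect().
   Returns 0 on success, -1 if path does not fit in sun_path. */
int load_unix_addr(struct sockaddr_un *addr, const char *path, socklen_t *len){
    if(strlen(path) >= sizeof(addr->sun_path)) return(-1); /* too long */
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path); /* safe: length was checked above */
    *len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + strlen(path));
    return(0);
}
```

A filename built on the fly that turns out too long is then reported as an error instead of silently overrunning or truncating sun_path.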

As far as I know, telnet cannot be used as a client for a Unix domain socket, which means that you must build a client in order to test your server. A simple client that enables you to send, and prints what the server sends, should suffice for a test.

The following table lists the procedures to build a server and a client for a Unix domain socket. Notice how similar these are to their TCP socket equivalents:

Server procedure                                     Client procedure
  • Make socket (socket())                             • Make socket (socket())
  • Declare and load sockaddr_un struct                • Declare and load sockaddr_un struct, with server info
  • Bind to filename (instead of port) (bind())        • Connect to server (connect())
  • Convert to passive listener socket (listen())      • Send and receive (send() and recv())
  • Accept (accept())
  • Send and receive (send() and recv())

For an example we'll use a trivial echo server that serves only one client and then terminates. The following Unix domain socket server code performs the server procedure listed earlier:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

#define BUFSIZE 500

void abortt(const char * msg){
    printf("\nFATAL ERROR: %s\n", msg);
    exit(1);
}


int main(int argc, char *argv[]){
    int listener_socket, data_socket;
    struct sockaddr_un local, remote;
    char buf[BUFSIZE];

    printf("Starting Unix Domain Socket Server\n");

    // MAKE SOCKET
    listener_socket = socket(AF_UNIX, SOCK_STREAM, 0);
    assert(listener_socket >= 0);
    printf("Listener socket created at handle %d\n", listener_socket);

    // DECLARE AND LOAD sockaddr_un STRUCT
    memset(&local, 0, sizeof(local));   // guarantees sun_path is null-terminated
    local.sun_family = AF_UNIX;
    strncpy(local.sun_path, "/tmp/mysocket", sizeof(local.sun_path) - 1);
    unlink(local.sun_path);             // remove stale filename from an earlier run

    // BIND TO FILENAME INSTEAD OF TO PORT
    socklen_t len = strlen(local.sun_path) + sizeof(local.sun_family);
    if(bind(listener_socket, (struct sockaddr *)&local, len) < 0)
        abortt("Bind failed");
    printf("Listener socket %d bound to filename %s\n", listener_socket, local.sun_path);

    // CONVERT TO PASSIVE LISTENER SOCKET
    if(listen(listener_socket, 5) < 0)
        abortt("Listen failed");
    printf("Listener socket %d is now passive\n", listener_socket);

    // CALL ACCEPT() AND WAIT
    len = sizeof(remote);               // accept() wants the size of the struct it fills
    data_socket = accept(listener_socket, (struct sockaddr*)&remote, &len);
    if(data_socket < 0)
        abortt("Accept failed");
    printf("Data socket %d created from listener socket %d\n", data_socket, listener_socket);

    // MAINTAIN ECHO SERVER
    int numChars = 1;                   // any positive value, just to enter the loop
    while(numChars > 0){
        numChars = recv(data_socket, buf, BUFSIZE-1, 0);
        if(numChars < 0){
            abortt("recv error");
        }
        else if(numChars == 0){
            printf("Client disconnected\n");
            break;
        }

        // IF YOU GOT HERE, RECV() RETURNED CHARACTERS, SEND BACK
        buf[numChars] = '\0';           // recv() does not null-terminate
        printf("Client said=>%s\n", buf);
        char prompt[100];
        snprintf(prompt, sizeof(prompt), "fromserver#%d=>", data_socket);
        send(data_socket, prompt, strlen(prompt), 0);
        send(data_socket, buf, numChars, 0);
    }
    close(data_socket);
    close(listener_socket);
    return 0;
}

The following Unix domain client code performs the client procedure and has facilities so the user can input three tildes to serve as a newline:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

#define BUFSIZE 500

void abortt(const char * msg){
    printf("\nFATAL ERROR: %s\n", msg);
    exit(1);
}


int main(int argc, char *argv[]){
    int data_socket;
    struct sockaddr_un remote;
    char buf[BUFSIZE];

    printf("Starting Unix Domain Socket Server\n");

    // MAKE SOCKET
    data_socket = socket(AF_UNIX, SOCK_STREAM, 0);
    assert(data_socket >= 0);
    printf("Data socket created at handle %d\n", data_socket);

    // DECLARE AND LOAD sockaddr_un STRUCT
    memset(&remote, 0, sizeof(remote)); // guarantees sun_path is null-terminated
    remote.sun_family = AF_UNIX;
    strncpy(remote.sun_path, "/tmp/mysocket", sizeof(remote.sun_path) - 1); // server bound to /tmp/mysocket

    // CONNECT TO SERVER
    printf("Connecting to server...\n");
    socklen_t len = strlen(remote.sun_path) + sizeof(remote.sun_family);
    int result = connect(data_socket, (struct sockaddr *) &remote, len);
    if(result < 0)
        abortt("ERROR: Failure in connect()");
    printf("Connected socket %d bound to filename %s\n", data_socket, remote.sun_path);


    // DIALOG LOOP
    while(1){
        printf(">");

        // ACQUIRE STRING FROM USER AND TWEAK
        if(fgets(buf, sizeof(buf), stdin) == NULL)
            break;                      // EOF or read error on stdin
        size_t n = strlen(buf);
        if(n > 0 && buf[n-1] == '\n')
            buf[--n] = '\0';            // strip the newline fgets() keeps
        if(n >= 3 && !strcmp(buf + n - 3, "~~~")){
            buf[n-3] = '\n';            // three tildes stand for a newline
            buf[n-2] = '\0';
        }

        if(strlen(buf) > 0){ // 0 bytes is no send at all!
            send(data_socket, buf, strlen(buf), 0);

            int nReceived = recv(data_socket, buf, sizeof(buf)-1, 0);
            // printf("dia nReceived=%d\n", nReceived);
            if(nReceived > 0){
                buf[nReceived] = '\0';
                printf("%s\n", buf);
            }
            else if(nReceived == 0){
                printf("Data socket %d disconnected.\n", data_socket);
                break;
            }
            else {
                perror("Recv() error");
                printf("Recv() error on socket %d\n", data_socket);
                abortt("Recv() error");
            }
        }
    }
    close(data_socket);
    return 0;
}

The following is the result of a server and client session with the preceding code:
CLIENT SESSION
[slitt@mydesk sockets]$ gcc -Wall udomain_client.c
[slitt@mydesk sockets]$ ./a.out
Starting Unix Domain Socket Server
Data socket created at handle 3
Connecting to server...
Connected socket 3 bound to filename /tmp/mysocket
>one
fromserver#8=>one
>two
fromserver#8=>two
>three
fromserver#8=>three
>This ends in a newline~~~
fromserver#8=>This ends in a newline

>After typing this, I will hit Ctrl+C
fromserver#8=>After typing this, I will hit Ctrl+C
>
[slitt@mydesk sockets]$

SERVER SESSION
[slitt@mydesk sockets]$ ./udomain.exe
Starting Unix Domain Socket Server
Listener socket created at handle 7
Listener socket 7 bound to filename /tmp/mysocket
Listener socket 7 is now passive
Data socket 8 created from listener socket 7
Client said=>one
Client said=>two
Client said=>three
Client said=>This ends in a newline

Client said=>After typing this, I will hit Ctrl+C
Client disconnected
[slitt@mydesk sockets]$

Unix Domain Socket Pros and Cons

What should you use -- TCP sockets or Unix domain sockets?

First things first: If right now you need clients from other computers to communicate with the server, you cannot use Unix domain sockets. Unix domain sockets communicate only with clients on the same computer as the server.

Next question: If right now the server needn't be accessed by clients from remote computers, do you think that later you'll need that remote access? If so, building it right now with TCP sockets enables your server to work both locally and remotely, so you won't need to reprogram later.

However, the fact that a Unix domain socket server cannot be accessed remotely yields some security benefits. Adding to those benefits, client access can be controlled per user by changing the ownership and permissions of the filename to which the server's listener socket is bound.

But then again, anyone with physical or ssh access to the local computer can change the filename's ownership if he has root access, and can change its permissions if he has access as either root or the file's owner. Messing with a file's ownership and permissions is easier for a non-expert than tweaking port numbers and the like.

Bottom line, unless remote access is a must, I can't advise you which way to go. Read this section and make the best decision you can. Remember that Unix domain sockets and TCP sockets are very similar and their client and server code is very similar, so if you make the "wrong" decision it's fairly easy to change, especially if you program modularly and isolate the parts related to sockets.
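
One way to isolate the socket-specific part is to hide the address family behind a pair of small functions that each return a listener socket. The following is a sketch under stated assumptions: listen_unix(), listen_tcp() and bind_and_listen() are hypothetical names, and the Unix version passes sizeof(struct sockaddr_un) as the length, another accepted form when the struct has been zeroed first.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Shared tail: bind and listen, or close the socket and fail. */
static int bind_and_listen(int fd, const struct sockaddr *addr, socklen_t len){
    if(fd < 0)
        return -1;
    if(bind(fd, addr, len) < 0 || listen(fd, 5) < 0){
        close(fd);
        return -1;
    }
    return fd;
}

/* Listener bound to a filename. Returns the socket or -1. */
int listen_unix(const char *path){
    struct sockaddr_un local;
    assert(path != NULL);
    if(strlen(path) >= sizeof(local.sun_path))
        return -1;
    memset(&local, 0, sizeof(local));
    local.sun_family = AF_UNIX;
    strcpy(local.sun_path, path);           /* safe: length checked above */
    unlink(path);                           /* remove a stale filename */
    return bind_and_listen(socket(AF_UNIX, SOCK_STREAM, 0),
                           (struct sockaddr *)&local, sizeof(local));
}

/* Listener bound to a TCP port on all interfaces. Returns the socket or -1. */
int listen_tcp(unsigned short port){
    struct sockaddr_in local;
    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    local.sin_port = htons(port);
    return bind_and_listen(socket(AF_INET, SOCK_STREAM, 0),
                           (struct sockaddr *)&local, sizeof(local));
}
```

Everything after this point (accept(), recv(), send(), close()) works the same on either kind of listener, so switching from a Unix domain socket to a TCP socket becomes a one-line change in the caller.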

UDP Sockets

By Steve Litt
This magazine has gotten too long, so I'll save UDP sockets for a later issue. Suffice it to say that UDP does not set up a communication channel, but instead just throws the information out to the network, and grabs information from the network. There's no guarantee of end-to-end delivery, nor of in-order delivery. Whereas TCP sockets and the Unix domain stream sockets shown in this issue send and receive data as byte streams with no message boundaries, UDP communicates in discrete chunks called datagrams.
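
The stream-versus-datagram difference can be seen without touching the network. The sketch below is not UDP: as a stand-in it uses socketpair() with Unix domain sockets, once as SOCK_STREAM and once as SOCK_DGRAM, and the function name first_recv_size() is hypothetical. It shows only the boundary behavior; the loss and reordering that real UDP permits do not occur on a local socket pair.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: send "one" and then "three" through a connected
   local socket pair of the given type (SOCK_STREAM or SOCK_DGRAM) and
   return how many bytes the first recv() hands back, or -1 on error. */
int first_recv_size(int socktype){
    int sv[2];
    char buf[100];
    assert(socktype == SOCK_STREAM || socktype == SOCK_DGRAM);
    if(socketpair(AF_UNIX, socktype, 0, sv) < 0)
        return -1;
    ssize_t n = -1;
    if(send(sv[0], "one", 3, 0) == 3 && send(sv[0], "three", 5, 0) == 5)
        n = recv(sv[1], buf, sizeof(buf), 0);  /* room for both sends */
    close(sv[0]);
    close(sv[1]);
    return (int)n;
}
```

With SOCK_DGRAM the first recv() returns exactly 3 bytes, the first datagram, no matter how large the buffer is. With SOCK_STREAM the receiver is free to hand back both sends merged into one read, which is why stream programs that need message boundaries must add their own delimiters or length prefixes.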
Steve Litt has been writing Troubleshooting Professional Magazine for ten years. Steve can be reached at his email address.

Hey, Cool!

By Steve Litt
A week after Gary's presentation, I saw an Orlando Ruby User Group presentation on a Ruby-based messaging software tool called Starling. The idea is that a message producer sends out a message through a message queue, and lots of message consumers compete to grab and service that message. In so doing, the message producer can offload non-time-critical work to other resources, so it continues interacting with the user at lightning speed while the load is distributed. This yields benefits in performance, scaling and graceful degradation (it doesn't crash).

Hey cool, but what if you're not using Ruby with Starling?

Starling | Sockets
Message producer | Socket server
Message consumer | Socket client
Message queue | Queue in front of socket server

My understanding is that Starling isn't good at having the message consumer send information back to the message producer. Obviously, sockets are great at that.

For instance, take my prime number generation page at http://www.troubleshooters.com/codecorn/primenumbers/primenumbers.htm, and perhaps more specifically the article on prime-finding clusters at http://www.troubleshooters.com/codecorn/primenumbers/primenumbers.htm#_The_Cluster_Solution. Imagine each computer having the software for the central computer (lower prime array, logistics control etc) and the satellite computers (scan range X for primes and write out the primes to disk).

Now, the central computer sends out message "somebody check 1 billion to 2 billion". Satellite computer R is free, so it sends back "OK, it's mine" after which the central computer says "You've got it" and marks it checked out. Along with the message to the satellite, the central also sends where to find the prime factor array with which to test the range, and on which computer to replicate the results of the prime scan (so if the computer crashes later, its results are backed up).

When a computer finishes, it replies back through the socket connection that it finished and is now free.

Or maybe you could do it the other way. Have each satellite with nothing to do ask the central computer "give me something to do", and the central computer marks the next task unavailable and sends that task to the requesting satellite. In either scenario, computers could be added and subtracted at will, assuming adequate data redundancy.
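
Here is a minimal sketch of the central computer's side of that "give me something to do" exchange. Everything in it is an assumption made for illustration: the struct task layout, the function names checkout_next() and serve_one_request(), and the tiny text protocol (a satellite sends "GIVE", the central replies with a range or "NONE"). The socket passed in is any connected stream socket, such as the data socket returned by accept().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical task record kept by the central computer. */
struct task {
    long lo, hi;        /* range to scan for primes */
    int checked_out;    /* nonzero once handed to a satellite */
};

/* Mark the next free task as checked out; return its index or -1. */
int checkout_next(struct task *tasks, int ntasks){
    for(int i = 0; i < ntasks; i++){
        if(!tasks[i].checked_out){
            tasks[i].checked_out = 1;
            return i;
        }
    }
    return -1;
}

/* Central side: read one request from a satellite and answer it.
   Replies "lo hi\n" for a task or "NONE\n" when nothing is left.
   Returns the index handed out, -1 if none, -2 on a bad request. */
int serve_one_request(int sock, struct task *tasks, int ntasks){
    char req[32], reply[64];
    assert(tasks != NULL);
    ssize_t n = recv(sock, req, sizeof(req) - 1, 0);
    if(n <= 0)
        return -2;                      /* disconnect or error */
    req[n] = '\0';                      /* recv() does not null-terminate */
    if(strncmp(req, "GIVE", 4) != 0)
        return -2;                      /* not a request we understand */
    int i = checkout_next(tasks, ntasks);
    if(i >= 0)
        snprintf(reply, sizeof(reply), "%ld %ld\n", tasks[i].lo, tasks[i].hi);
    else
        snprintf(reply, sizeof(reply), "NONE\n");
    send(sock, reply, strlen(reply), 0);
    return i;
}
```

A real cluster would also need a "finished" message so the central computer can record results and free the satellite, plus a way to reclaim tasks from satellites that crash. Those are left out to keep the sketch short.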

And many specific jobs can be programmed in a distributed manner using sockets.

Sockets are an excellent thin interface. They're the basis of most client/server applications. They can easily be used to separate data from user interface from logic.

The socket part of a client or server is easily segregated from other logic:
[Diagram of socket server app]

[Diagram of socket client app]
Think of all the things to do with this easy technology. And remember one more thing: Although this tutorial has emphasized C language programming so it won't seem like magic, these same functionalities are available in Perl, Python, Ruby, and many other languages.
Steve Litt has been writing Troubleshooting Professional Magazine for ten years. Steve can be reached at his email address.

Life After Windows: Thank You Gary

The inspiration for this month's magazine was Gary Miller's spectacular socket programming presentation at the March GoLUG meeting. Gary is a professional developer and a college instructor teaching computer game development, machine architecture, C and Assembler. He not only writes great code, he can explain it. Gary attempted to teach us socket programming in 1.5 hours, and he came surprisingly close, showing us a TCP socket server and client (echoserver). Even better, Gary spelled out the fundamentals:
Armed with that information, I had confidence I could handle sockets, resulting in this Linux Productivity Magazine issue.

To cover so much material, Gary needed to go pretty fast. Luckily, the GoLUG attendees were very bright and caught on fast. I got the impression I was the only one having trouble keeping up with the class. Then, a few days later, a fellow GoLUGger mentioned to me how great and instructive Gary's presentation was, and how he learned a ton even though the pace was fast. He then mentioned that the way Gary presented enabled faster learning in spite of the fast pace.

Bottom line: Gary gave a spectacularly effective presentation improving the knowledge and ability of most who attended.

Thank You Gary!

What does this have to do with life after Windows?

In my opinion, Gary's 1.5 hour presentation had a market value of somewhere between $200.00 and $500.00. Yeah, it was that good. But we all saw it for free, because it was a user group presentation. It's free, because we all give user group presentations and we all attend them. It's barter. I teach the group Ruby, Gary teaches socket programming, Kevin teaches rsync backup, Shawn teaches ssh keys. We all teach, we all learn, we all improve our skills.

But what does this have to do with life after Windows?

I've been to Windows-centric user groups. All too often the presentations are self-serving dog and pony shows by vendors. Stuff you can't use without spending hundreds or thousands just to get into the game. Stuff you have not the slightest interest in.

Even in computer language user group meetings, vendor dog and pony shows crop up entirely too often. Some languages cost a lot of money, and every language has its addons and specialized editors. This stuff is helpful if you want to use those products, but you're probably not going to spend money just to try them out.

Linux user groups are different. Everything presented in a Linux user group meeting is free software. You can hear about it at 7pm, download it at 10pm, and get pretty good at it by 7pm the next day.

Linux user group presentations tend to be higher quality than those in the Windows world. Windows user groups attract many lookyloos and posers. Lookyloos and posers are pretty easy to impress. At a Linux user group meeting, probably half the attendees are experts looking for that one new fact you can give them, and ready to correct any misconceptions you pass on. When you present at a Linux user group, you have to prepare.

What's life after Windows? It's getting a superior education free of charge.
Life After Windows is a regular Linux Productivity Magazine column, by Steve Litt, bringing you observations and tips subsequent to Troubleshooters.Com's Windows to Linux conversion.
Steve Litt is the founder and acting president of Greater Orlando Linux User Group (GoLUG).   Steve can be reached at his email address.

GNU/Linux, open source and free software

By Steve Litt
Linux is a kernel. The operating system often described as "Linux" is that kernel combined with software from many different sources. One of the most prominent, and oldest of those sources, is the GNU project.

"GNU/Linux" is probably the most accurate moniker one can give to this operating system. Please be aware that in all of Troubleshooters.Com, when I say "Linux" I really mean "GNU/Linux". I completely believe that without the GNU project, without the GNU Manifesto and the GNU/GPL license it spawned, the operating system the press calls "Linux" never would have happened.

I'm part of the press and there are times when it's easier to say "Linux" than explain to certain audiences that "GNU/Linux" is the same as what the press calls "Linux". So I abbreviate. Additionally, I abbreviate in the same way one might abbreviate the name of a multi-partner law firm. But make no mistake about it. In any article in Troubleshooting Professional Magazine, in the whole of Troubleshooters.Com, and even in the technical books I write, when I say "Linux", I mean "GNU/Linux".

There are those who think the FSF is making too big a deal of this. Nothing could be farther from the truth. The GNU project, combined with Richard Stallman's GNU Manifesto and the resulting GNU GPL, is the only reason we can enjoy this wonderful alternative to proprietary operating systems, and the only reason proprietary operating systems aren't even more flaky than they are now.

For practical purposes, the license requirements of "free software" and "open source" are almost identical. Generally speaking, a license that complies with one complies with the other. The difference between these two is a difference in philosophy. The "free software" crowd believes the most important aspect is freedom. The "open source" crowd believes the most important aspect is the practical marketplace advantage that freedom produces.

I think they're both right. I wouldn't use the software without the freedom guaranteeing me the right to improve the software, and the guarantee that my improvements will not later be withheld from me. Freedom is essential. And so are the practical benefits. Because tens of thousands of programmers feel the way I do, huge amounts of free software/open source is available, and its quality exceeds that of most proprietary software.

In summary, I use the terms "Linux" and "GNU/Linux" interchangeably, with the former being an abbreviation for the latter. I usually use the terms "free software" and "open source" interchangeably, as from a licensing perspective they're very similar. Occasionally I'll prefer one or the other depending on whether I'm writing about freedom or business advantage.
Steve Litt has used GNU/Linux since 1998, and written about it since 1999. Steve can be reached at his email address.

Letters to the Editor

All letters become the property of the publisher (Steve Litt), and may be edited for clarity or brevity. We especially welcome additions, clarifications, corrections or flames from vendors whose products have been reviewed in this magazine. We reserve the right to not publish letters we deem in bad taste (bad language, obscenity, hate, lewdness, violence, etc.).
Submit letters to the editor to Steve Litt's email address, and be sure the subject reads "Letter to the Editor". We regret that we cannot return your letter, so please make a copy of it for future reference.

How to Submit an Article

We anticipate two to five articles per issue. We look for articles that pertain to GNU/Linux or open source. This can be done as an essay, with humor, with a case study, or some other literary device. A Troubleshooting poem would be nice. Submissions may mention a specific product, but must be useful without the purchase of that product. Content must greatly overpower advertising. Submissions should be between 250 and 2000 words long.

Any article submitted to Linux Productivity Magazine must be licensed with the Open Publication License, which you can view at http://opencontent.org/openpub/. At your option you may elect the option to prohibit substantive modifications. However, in order to publish your article in Linux Productivity Magazine, you must decline the option to prohibit commercial use, because Linux Productivity Magazine is a commercial publication.

Obviously, you must be the copyright holder and must be legally able to so license the article. We do not currently pay for articles.

Troubleshooters.Com reserves the right to edit any submission for clarity or brevity, within the scope of the Open Publication License. If you elect to prohibit substantive modifications, we may elect to place editor's notes outside of your material, or reject the submission, or send it back for modification. Any published article will include a two sentence description of the author, a hypertext link to his or her email, and a phone number if desired. Upon request, we will include a hypertext link, at the end of the magazine issue, to the author's website, provided that website meets the Troubleshooters.Com criteria for links and that the author's website first links to Troubleshooters.Com. Authors: please understand we can't place hyperlinks inside articles. If we did, only the first article would be read, and we can't place every article first.

Submissions should be emailed to Steve Litt's email address, with subject line Article Submission. The first paragraph of your message should read as follows (unless other arrangements are previously made in writing):

Copyright (c) 2003 by <your name>. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, version Draft v1.0, 8 June 1999 (available at http://www.troubleshooters.com/openpub04.txt/, wordwrapped for readability at http://www.troubleshooters.com/openpub04_wrapped.txt). The latest version is presently available at http://www.opencontent.org/openpub/.

Open Publication License Option A [ is | is not] elected, so this document [may | may not] be modified. Option B is not elected, so this material may be published for commercial purposes.

After that paragraph, write the title, text of the article, and a two sentence description of the author.

Why not Draft v1.0, 8 June 1999 OR LATER

The Open Publication License recommends using the words "or later" to describe the version of the license. That is unacceptable for Troubleshooting Professional Magazine because we do not know the provisions of that newer version, so it makes no sense to commit to it. We all hope later versions will be better, but there's always a chance that leadership will change. We cannot take the chance that the disclaimer of warranty will be dropped in a later version.

Trademarks

All trademarks are the property of their respective owners. Troubleshooters.Com(R) is a registered trademark of Steve Litt.
