4. Byte Manipulation Functions

1. The Challenge

C defines strings as null-terminated arrays of characters. However, socket address structures (like sockaddr_in) often contain fields with null bytes (0x00) inside them (e.g., an IP address like 10.0.0.1 contains two zero bytes).

Standard C string functions (like strcpy) stop when they hit a null byte, so they cannot copy these structures correctly.
We need functions that copy a specific number of bytes, ignoring what the data actually is.
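
The difference can be sketched in a few lines of C. The two helper names below are hypothetical and exist only for illustration: one reports how many bytes a string function would treat as "the string", the other copies a fixed number of bytes regardless of content.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: how many bytes a string function such as strcpy
 * would treat as "the string" when handed raw address bytes. */
size_t bytes_seen_by_string_functions(const unsigned char *raw)
{
    return strlen((const char *)raw);   /* stops at the first 0x00 */
}

/* Hypothetical helper: copy exactly len bytes, zero bytes included. */
void copy_raw_bytes(unsigned char *dst, const unsigned char *src, size_t len)
{
    memcpy(dst, src, len);
}
```

For the bytes of 10.0.0.1 (0x0a 0x00 0x00 0x01), the first helper reports a length of 1, while the second copies all four bytes when asked to.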

2. The Berkeley Functions

These functions originated in the Berkeley (BSD) Unix releases. While older, they are still widely used in network programming.

bzero(dest, nbytes): Writes nbytes of zeros to the destination. Typical use: initializing a socket address structure to all zeros before filling it in.
bcopy(src, dest, nbytes): Copies nbytes from source to destination. It handles overlapping source and destination regions correctly.
bcmp(ptr1, ptr2, nbytes): Compares two byte strings. Returns 0 if they are identical, nonzero otherwise.
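
A minimal sketch of the three Berkeley calls applied to an IPv4 socket address structure. The helper names init_ipv4_addr and copy_and_check are hypothetical, and the sketch assumes a system whose <strings.h> still declares these functions (Linux, the BSDs, and macOS do).

```c
#include <netinet/in.h>
#include <strings.h>      /* bzero, bcopy, bcmp */
#include <sys/socket.h>

/* Hypothetical helper: zero the whole structure, then set the family. */
void init_ipv4_addr(struct sockaddr_in *sa)
{
    bzero(sa, sizeof(*sa));
    sa->sin_family = AF_INET;
}

/* Hypothetical helper: duplicate src into dst, then report whether the
 * two structures are byte-for-byte identical (1) or not (0). */
int copy_and_check(const struct sockaddr_in *src, struct sockaddr_in *dst)
{
    bcopy(src, dst, sizeof(*dst));            /* source first, destination second */
    return bcmp(src, dst, sizeof(*dst)) == 0;
}
```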

3. The ANSI C Functions

These are the standard C library functions found on all modern systems.

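memset(dest, value, len): Sets len bytes of memory to a specific value. memset(dest, 0, len) is the standard equivalent of bzero(dest, len). Note that the fill value is the second argument and the length is the third; swapping them is a common bug that typically compiles without error.
memcpy(dest, src, nbytes): Copies nbytes from source to destination. The order of arguments is dest, src, which is the reverse of bcopy (src, dest). Unlike bcopy, its behavior is undefined when the two regions overlap; use memmove in that case.
memcmp(ptr1, ptr2, nbytes): Compares two byte strings. Returns 0 if identical; otherwise the sign of the result reflects the first differing byte, compared as unsigned char.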
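
The same operations written with the ANSI C functions, as a sketch that makes the argument order visible. The helper names are hypothetical.

```c
#include <netinet/in.h>
#include <string.h>       /* memset, memcpy, memcmp */
#include <sys/socket.h>

/* Hypothetical helper: the memset equivalent of bzero(sa, sizeof(*sa)). */
void init_ipv4_addr_ansi(struct sockaddr_in *sa)
{
    memset(sa, 0, sizeof(*sa));   /* fill value second, length third */
    sa->sin_family = AF_INET;
}

/* Hypothetical helper: copy src into dst, then report whether the two
 * structures are byte-for-byte identical (1) or not (0). */
int copy_and_check_ansi(struct sockaddr_in *dst, const struct sockaddr_in *src)
{
    memcpy(dst, src, sizeof(*dst));           /* destination first, like dst = src */
    return memcmp(dst, src, sizeof(*dst)) == 0;
}
```

One way to remember the memcpy order is that it matches an assignment statement: the destination is on the left.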

Note

The book uses bzero throughout its examples because it is simpler (only two arguments) and less prone to errors than memset.

4. inet* Functions

This section covers the legacy functions used to convert IPv4 addresses between text strings (e.g., “192.168.1.1”) and binary form.

inet_aton (ASCII to Network):
- Converts a dotted-decimal string into a 32-bit binary value in network byte order.
- Returns 1 if the string is valid, 0 if it is not.
- Verdict: This is the preferred choice among the legacy IPv4-only functions.
inet_addr (Deprecated):
- An older function that does the same conversion but returns the binary value directly, using the constant INADDR_NONE (typically 32 one-bits, 0xffffffff) to signal an error.
- Problem: The binary value of the broadcast address 255.255.255.255 is also all ones, so the function cannot distinguish between the broadcast address and an error.
- Verdict: Do not use this.
inet_ntoa (Network to ASCII):
- Converts a binary IPv4 address back into a dotted-decimal string.
- Problem: It returns a pointer to a static buffer inside the function, and every call overwrites that buffer. If you call it twice in the same printf statement, both calls return the same pointer, so printf shows the same address twice.
- Verdict: Useful for simple debugging, but not thread-safe or re-entrant.
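
A short sketch of the legacy calls, assuming a POSIX-like system that provides <arpa/inet.h>. The helper names parse_ipv4 and format_pair are hypothetical. The second helper shows one way around the static-buffer problem: copy each inet_ntoa result out before calling the function again.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: parse dotted-decimal text with inet_aton.
 * Returns 1 on success, 0 if the text is not a valid IPv4 address. */
int parse_ipv4(const char *text, struct in_addr *out)
{
    return inet_aton(text, out) != 0;
}

/* Hypothetical helper: format two addresses as "a -> b". The first
 * inet_ntoa result is copied out of the shared static buffer before
 * the second call overwrites it. */
void format_pair(struct in_addr a, struct in_addr b, char *buf, size_t buflen)
{
    char first[INET_ADDRSTRLEN];

    snprintf(first, sizeof(first), "%s", inet_ntoa(a));
    snprintf(buf, buflen, "%s -> %s", first, inet_ntoa(b));
}
```

Because inet_aton reports success through its return value rather than through the converted address, 255.255.255.255 parses without being mistaken for an error.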

5. Modern inet* Functions

This section introduces the modern functions that you should use. They work for both IPv4 and IPv6.

p stands for Presentation: The text string (what users see).
n stands for Numeric: The binary value (what the network uses).

inet_pton (Presentation to Numeric)
- Input: Address Family (AF_INET or AF_INET6), the string pointer, and a pointer to the buffer where the binary address will be stored.
- Returns:
  - 1 if successful.
  - 0 if the string was not a valid format.
  - -1 on error (the address family is not supported; errno is set to EAFNOSUPPORT).
inet_ntop (Numeric to Presentation)
- Input: Address Family, pointer to the binary address, pointer to a buffer to hold the result string, and the size of that buffer.
- Buffer Size: To avoid magic numbers, the system defines constants for the maximum size of an address string:
  - INET_ADDRSTRLEN (16 bytes for IPv4)
  - INET6_ADDRSTRLEN (46 bytes for IPv6)
- Returns: A pointer to the result string, or NULL on error (for example, errno is set to ENOSPC if the buffer is too small).

Note

Use inet_pton and inet_ntop for all new code. They handle both IPv4 and IPv6, and inet_ntop writes into a caller-supplied buffer instead of a static one.

6. sock_ntop and Related Functions

The standard functions inet_ntop and inet_pton still have two drawbacks:

Protocol Dependence: They require you to pass the address family (AF_INET or AF_INET6) explicitly. This makes code harder to write if you want it to support both IPv4 and IPv6 automatically.
Structure Dependence: They require you to pass a pointer to the binary address inside the socket structure (e.g., &addr.sin_addr), not the socket structure itself. This means your code has to “know” which type of structure it is dealing with.

To fix this, the authors created their own custom wrapper function called sock_ntop.

How it works: You simply pass it a pointer to a socket address structure (struct sockaddr *) and its length.
What it does: It looks inside the structure (at the sa_family field), determines if it is IPv4 or IPv6, and then formats the string accordingly.
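The Format: It returns a string containing both the IP address and the port number (e.g., 206.168.112.96:13 or [2001:db8::1]:9877). The string lives in a static buffer, so, like inet_ntoa, the function is not re-entrant or thread-safe.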
Why it’s better: Your code doesn’t need to know if the address is IPv4 or IPv6; the function handles the details for you.
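
The dispatch-on-family idea can be sketched as follows. This is not the book's source code: the name sock_ntop_sketch is hypothetical, and unlike the interface described above it takes a caller-supplied buffer instead of a length argument and a static result, which is an assumption made here to keep the example thread-safe.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/socket.h>

/* Sketch only: inspect sa_family, then write "addr:port" (IPv4) or
 * "[addr]:port" (IPv6) into buf. Returns buf, or NULL for an
 * unsupported family or a conversion failure. */
const char *sock_ntop_sketch(const struct sockaddr *sa, char *buf, size_t buflen)
{
    char host[INET6_ADDRSTRLEN];

    switch (sa->sa_family) {
    case AF_INET: {
        const struct sockaddr_in *sin = (const struct sockaddr_in *)sa;
        if (inet_ntop(AF_INET, &sin->sin_addr, host, sizeof(host)) == NULL)
            return NULL;
        snprintf(buf, buflen, "%s:%d", host, ntohs(sin->sin_port));
        return buf;
    }
    case AF_INET6: {
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sa;
        if (inet_ntop(AF_INET6, &sin6->sin6_addr, host, sizeof(host)) == NULL)
            return NULL;
        snprintf(buf, buflen, "[%s]:%d", host, ntohs(sin6->sin6_port));
        return buf;
    }
    default:
        return NULL;
    }
}
```

The caller passes any socket address cast to struct sockaddr * and never names the address family itself.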

The text introduces a family of these helper functions to handle various tasks without worrying about the protocol version:

sock_bind_wild: Binds the wildcard address (0.0.0.0 or ::) and an ephemeral (kernel-chosen) port to a socket.
sock_cmp_addr: Compares two socket address structures to see if the IP addresses are the same.
sock_cmp_port: Compares two socket address structures to see if the port numbers are the same.
sock_get_port: Returns the port number from a socket address structure.
sock_ntop_host: Returns only the IP address string (no port number).
sock_set_addr: Sets the IP address in a socket address structure.
sock_set_port: Sets the port number in a socket address structure.
sock_set_wild: Sets the IP address to the wildcard (any interface).
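
Two of these helpers might look roughly like the sketch below. The names carry a _sketch suffix because this is not the book's code, and the return conventions (-1 for an unsupported or mismatched family) are assumptions made for the example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Sketch: port number in host byte order, or -1 for an unsupported family. */
int sock_get_port_sketch(const struct sockaddr *sa)
{
    switch (sa->sa_family) {
    case AF_INET:
        return ntohs(((const struct sockaddr_in *)sa)->sin_port);
    case AF_INET6:
        return ntohs(((const struct sockaddr_in6 *)sa)->sin6_port);
    default:
        return -1;
    }
}

/* Sketch: 0 if both structures hold the same IP address (ports are
 * ignored), nonzero if they differ or the families do not match. */
int sock_cmp_addr_sketch(const struct sockaddr *a, const struct sockaddr *b)
{
    if (a->sa_family != b->sa_family)
        return -1;

    switch (a->sa_family) {
    case AF_INET:
        return memcmp(&((const struct sockaddr_in *)a)->sin_addr,
                      &((const struct sockaddr_in *)b)->sin_addr,
                      sizeof(struct in_addr));
    case AF_INET6:
        return memcmp(&((const struct sockaddr_in6 *)a)->sin6_addr,
                      &((const struct sockaddr_in6 *)b)->sin6_addr,
                      sizeof(struct in6_addr));
    default:
        return -1;
    }
}
```

Both helpers follow the same pattern as sock_ntop: switch on sa_family, cast to the matching structure type, and hide the difference from the caller.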
