
TCP/IP Update

Most recent Ethernet hardware only gives the packet to the software if the MAC address matches or if it is a broadcast packet. Putting it into promiscuous mode is the exception.
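As a rough illustration of the filtering rule described above, here is a minimal Python sketch of the accept/reject decision a card (or software standing in for it) makes on the destination address. The function name `accept_frame` and the sample addresses are made up for the example, and multicast filtering is deliberately left out.

```python
BROADCAST = bytes([0xFF] * 6)  # Ethernet broadcast address ff:ff:ff:ff:ff:ff


def accept_frame(frame: bytes, my_mac: bytes, promiscuous: bool = False) -> bool:
    """Return True if a receiver with address my_mac would pass this frame up.

    Hypothetical helper: it models unicast and broadcast matching only.
    """
    if len(frame) < 14:
        # Too short to hold destination (6) + source (6) + type (2).
        return False
    if promiscuous:
        # Promiscuous mode bypasses the address check entirely.
        return True
    dest = frame[0:6]
    return dest == my_mac or dest == BROADCAST
```

When the hardware does this check, software never sees the rejected frames at all, which is why repeating the comparison in software is usually redundant.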

My first attempts at apps programming with my TCP stuff were based on UDP with my own sequence numbers (packet numbers) and acks. Kind of like doing Xmodem over Ethernet. It worked, but it takes a bit of extra effort to get the benefit of a sliding window. For a small machine I wouldn't even attempt it - ACKing after each packet is fine by me.
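To make the "ACK after each packet" idea concrete, here is a small Python simulation of stop-and-wait delivery with packet numbers. It is a sketch under assumptions, not the poster's code: there is no real UDP socket, the lossy link is modelled by the hypothetical `lost_data` and `lost_acks` sets, and a timeout is simply a retry of the loop.

```python
def stop_and_wait(chunks, lost_data=(), lost_acks=(), max_tries=5):
    """Deliver chunks in order over a simulated lossy link.

    lost_data / lost_acks hold (packet_number, attempt) pairs that the
    simulated link drops. Returns (received_chunks, total_transmissions).
    """
    lost_data = set(lost_data)
    lost_acks = set(lost_acks)
    received = []
    expected = 0        # receiver side: next packet number it will accept
    transmissions = 0
    for seq, chunk in enumerate(chunks):
        for attempt in range(max_tries):
            transmissions += 1
            if (seq, attempt) in lost_data:
                continue  # data lost: sender times out and retransmits
            if seq == expected:
                received.append(chunk)  # new packet: deliver it once
                expected += 1
            # Otherwise it is a duplicate caused by a lost ACK; the receiver
            # re-ACKs it but does not deliver it a second time.
            if (seq, attempt) in lost_acks:
                continue  # ACK lost: sender times out and retransmits
            break         # ACK arrived: move on to the next packet
        else:
            raise TimeoutError(f"packet {seq} was never acknowledged")
    return received, transmissions
```

Only one packet is ever in flight, so the receiver needs a single counter and no reordering buffer. A sliding window would allow several unacknowledged packets at once, at the cost of tracking each of them.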

How are you going to put the machine on Ethernet? Do you have a card for the bus on the Altair, or are you planning to design one?
 
A few years ago I connected the CS8900 to an AVR and played around with that.

I can handle writing simple kernel device drivers and TCP/IP programming for the Unix and Windows console, but the GUI stuff just makes me lose interest.

I'm using the PacketWhacker from edtp.com.

Maybe I was wasting my time checking the MAC address on every packet. The AVR is nearly 20 times faster than the 8080 though! I had time to waste! ;)
 
Wow - I liked edtp.com. I'm going to have to browse that more.

I'm a software weenie, but I understand quite a bit of hardware. I've been looking to 'graft' Ethernet onto my non-standard PCjrs, and had been looking at something similar from embeddedethernet.com. The idea is the same - a basic 10Mb/sec Ethernet chipset with minimal I/O interfacing requirements.

Things get interesting when you move from application space down into the TCP/IP stack. You can do a bare-bones TCP implementation in less than 16K that barely meets spec but will still talk to other machines. But a good one with proper flow control, listen support, etc. takes quite a bit more.

Then there is the issue of the Ethernet hardware. I'm using packet drivers so this work is done for me, but something has to initialize the hardware and service it. That's not trivial code.

Promiscuous mode can be set in the hardware .. it's interesting, but letting the hardware do the filtering is obviously much easier.
 
Getting closer to a public test ..

I've added TCP reset support recently, and now I'm 'scrubbing' the code for correct behavior, and memory leaks.

For the first public test I'm going to put out a simple 'echo server' that echoes back what you send to it. It'll have a few more goodies too to make it slightly more interesting, like a command to show you how long it has been running, a command to show the number of active connections, etc.

The idea is that if it survives for a few days and gets a few hundred connections, then I probably don't have any memory leak problems. I'm going to log the TCP/IP packets too so that I can see if my code is causing problems.
 
I'll be sure to connect and mess around with it for you, Mike.
 
Still lots to do, but I'm itching to test it. So it's out there for your connecting pleasure.

Telnet to 24.159.203.149, port 2301

That IP address is my Linux machine. Port 2301 is forwarded to my 386-40 running my homebrew TCP/IP, and a small server program. The server program handles up to 9 simultaneous users and does some simple tasks, like report the number of open sockets, free memory, etc.
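For readers who want a feel for what such a test server does per line of input, here is a minimal Python sketch of the bookkeeping. Everything in it is an assumption for illustration: the class name `EchoServerState`, the command names `uptime` and `conns`, and the reply text are invented, and the real server's commands and output are not shown in this thread. Only the limit of 9 users comes from the post.

```python
import time

MAX_USERS = 9  # the post says the server handles up to 9 simultaneous users


class EchoServerState:
    """Hypothetical per-server state: start time and active connection count."""

    def __init__(self, now=time.monotonic):
        self._now = now            # injectable clock, handy for testing
        self.started = now()
        self.active = 0

    def connect(self) -> bool:
        """Admit a new user unless the server is already full."""
        if self.active >= MAX_USERS:
            return False
        self.active += 1
        return True

    def disconnect(self) -> None:
        """Drop one user; never let the count go negative."""
        if self.active > 0:
            self.active -= 1

    def handle_line(self, line: str) -> str:
        """Answer a known command, or echo the line back unchanged."""
        cmd = line.strip().lower()
        if cmd == "uptime":
            return f"up {int(self._now() - self.started)} seconds"
        if cmd == "conns":
            return f"{self.active} active connections"
        return line
```

Keeping the counters in one place makes it easy to watch them from a second connection, which is how leaks or stuck sockets tend to show up during a test like this.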

If it runs through the night without corrupting anything I'll be pretty happy. And if it doesn't, I've got all sorts of trace logs running so that I can figure out what the problem is. :-(

On Windows machines you can use the built-in telnet client - the command looks like "telnet 24.159.203.149 2301". Same on Linux. You can connect multiple times from the same machine if you are inclined - the stats will change as you do it.

Edit: It's still running .. feel free to try it out and help me test it! I will edit this post again when the test is over.
 
Testing, testing. I'm using PuTTY + telnet from Linux simultaneously. Are there any Easter eggs for us to discover? ;-)

Once I ended a session prematurely, but the socket remained allocated for a while (as observed from the other connection). After a while, it disappeared, maybe due to a timeout.

Otherwise everything worked fine until I performed a little stress test that seems to have brought down the server... :-(

I issued the "info" command 40 times in a row. The server answered the first 12 requests, but ignored the rest. Before the stress test, "mem" reported about 448000 bytes left. Afterwards, it was only 1136 bytes and then the server seems to have died.

I hope I didn't cause any trouble, but it looks like you have a major memory leak if a small flood of commands consumes all the memory. Yes, I'm aware it is just a 386 but preferably it should be able to withstand something like that.
 
Carlsson - my hero!

I know that it died from corrupted memory, and I know that it happened in the last hour or so. I was just starting to dig through the logs to find it.

I'm going to recreate it here and see where I goofed up ...

Otherwise, it had been running for over 36 hours. It didn't handle a large number of connections, but I wasn't expecting that kind of testing. :)
 
And the preliminary cause of death is bad reset handling ...

I see from your side that things were going fine, and then your machine reset the TCP/IP connection. My machine screwed up after that and sent the 'Hello' message to what should have been a dead socket. That's a problem ...

Time to debug - thanks again!

Mike
 
First time I've gotten credit for crashing a server... I suppose if you build a BBS on top of it, you may get plenty of concurrent traffic even if the number of clients is limited, so it's good to stress test it at an early stage.
 
The reset processing was a red herring, but that might be a problem anyway. What OS did you connect from?

In one of your sessions the socket was established, 240 chars of data was sent, and then a FIN packet was sent to close the connection. In transit the FIN packet arrived before the data, which I handled fine. Eventually your client started resetting the connection, which was bogus behavior but I handled it just fine.

I got hosed on deallocating the memory for the receive buffer. The reset processing found a path in the code where I could deallocate the same memory twice, which is a no-no. That's what corrupted the heap and crashed everything.
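The thread does not show the actual teardown code, so the following is only an analogy in Python for one common way to make a release safe to call from more than one path: clear the owner's reference before handing the buffer back. The names `Connection`, `release_rx` and the list used as a free pool are hypothetical. In a language with manual memory management the same idea is usually expressed by setting the pointer to null right after freeing it.

```python
class Connection:
    """Hypothetical connection that owns one receive buffer from a shared pool."""

    def __init__(self, pool, buf_size=1024):
        self._pool = pool               # free list shared by all connections
        self.rx = bytearray(buf_size)   # this connection's receive buffer

    def release_rx(self) -> None:
        """Return the receive buffer to the pool at most once.

        Timeout, reset and normal close can all call this; only the first
        call hands the buffer back, later calls do nothing.
        """
        buf, self.rx = self.rx, None    # forget the buffer before releasing it
        if buf is not None:
            self._pool.append(buf)
```

Without the guard, two teardown paths would each put the same buffer on the free list, and two later connections could end up sharing it, which is the pool-level equivalent of a corrupted heap.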

I didn't recreate the bug directly, but was able to simulate it by adding an extra line of code. Sure enough, the behavior is the same ...

Time to go back and scrub the code to ensure that I don't double delete a piece of storage again.

I'll never live long enough to get a BBS on top of this. It's been a long hard year already. :)
 
I was connecting from a Debian 3.1 system, kernel 2.4.27 (P4 Xeon) but I don't think it should make a difference?

The session I ended prematurely was initialized like this:

$ telnet 24.159.203.149 2301 < batchfile

in an attempt to pipe in a batch file of commands. Obviously it is not possible to do something like this, and the session was ended. The batch file contained exactly 240 characters of data ("info"+CRLF repeated 40 times). Yet your server still was functioning relatively fine afterwards. I should have issued a "mem" command right after the prematurely ended session.
 
Oh, but it does make a difference?

I was wondering how you got those commands in that fast. It also explains why the Reset was sent.

So here is the chain of events as I see it from a tcpdump log:

  1. Initial connection
  2. SYN/ACK packet back to your machine
  3. ACK back from your machine (connection is now established)
  4. FIN packet with a sequence number that is too high from your machine
  5. My machine ignores the packet and sends an ACK to tell your machine what the expected sequence number is
  6. Delayed packet with 240 chars of data from your machine arrives
  7. My machine processes that packet, because it has the correct sequence numbers
  8. My machine sends the hello message and ACKs your 240 chars of input
  9. Your machine sends a RESET on the connection with an older sequence number (which is bogus)
  10. My machine ignores the bogus RESET packet because it is out of window
  11. My machine retries sending the initial 'hello' packet because it has timed out
  12. My machine eventually times out trying to resend the packets, and kills the connection.
  13. The extra delete of the receive buffer happens here .. and things start to go bad.
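The decisions in steps 4 to 10 come down to comparing an incoming sequence number with what the receiver expects. Here is a minimal Python sketch of that test, assuming a simple stack that does not buffer out-of-order segments. The function names, the return strings and the sequence numbers in the usage note are invented for illustration. It is not a full implementation of the TCP acceptability rules.

```python
MOD = 1 << 32  # TCP sequence numbers wrap at 2**32


def in_window(seq: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """True if seq lies in [rcv_nxt, rcv_nxt + rcv_wnd), allowing wraparound."""
    return ((seq - rcv_nxt) % MOD) < rcv_wnd


def handle_segment(seq: int, rcv_nxt: int, rcv_wnd: int, is_rst: bool = False) -> str:
    """Decide what a simple receiver does with an incoming segment.

    Returns "accept" (process it), "ack" (discard it and send an ACK that
    carries the expected sequence number) or "drop" (discard it silently).
    """
    if seq == rcv_nxt:
        return "accept"
    if is_rst and not in_window(seq, rcv_nxt, rcv_wnd):
        return "drop"   # an out-of-window reset is ignored
    return "ack"        # tell the sender what we actually expect
```

With a made-up expected sequence number of 1001 and 240 bytes still in transit, the early FIN would arrive carrying 1241 and get "ack", the delayed data at 1001 gets "accept", and a reset carrying an older number gets "drop".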

Your machine should not have sent the RESET packet because my machine did not ACK the FIN packet, which came in out of order and was not processed. Which is why I call the RESET packet bogus.

However, I didn't know that you were redirecting your input into telnet - that makes things different. As soon as telnet finished sending the input, it closed the connection and didn't wait for any output! So when my machine started sending output, it was sending it to a closed connection, and hence the reset packet.

Telnet was not well behaved here - it should have waited for output before hammering the connection and closing it. It didn't even wait for my side to ACK the FIN packet, or send a FIN packet of my own. Very bad behavior, and not in spec ..

Normal telnet would not have done this - this is a side effect of redirecting stdin.
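For comparison, here is a minimal Python sketch of a batch client that behaves more politely: it sends everything, closes only its sending side, and keeps reading until the server closes. The function name `send_batch` is hypothetical. The sketch also assumes the server closes the connection after it sees the client's FIN, which this thread does not confirm for the test server. If the server keeps the connection open, `recv` raises a timeout instead of returning.

```python
import socket


def send_batch(host: str, port: int, payload: bytes, timeout: float = 5.0) -> bytes:
    """Send payload, half-close, then read until the server closes its side."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        # Half-close: our FIN goes out, but the socket stays open for reading,
        # so the server's replies are received instead of triggering a reset.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:      # empty read: the server has closed its side
                break
            chunks.append(data)
    return b"".join(chunks)
```

A call such as `send_batch("24.159.203.149", 2301, b"info\r\n" * 40)` would send the same 240 bytes as the redirected batch file, but wait for the responses before tearing the connection down.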

Interesting overall .. now I understand why the RESET packet came in.

If I didn't have the double delete my machine would have recovered fine, even though the network sent packets out of order and your machine didn't wait for output or the final FIN from my side. A very good test though .. I'm going to add that to my testing now.


Mike
 