A while back I posted about my work to enable support for the Microchip MCP2515 CAN controller on my Gumstix Overo. At the time I was confident it worked because I was receiving messages from my AVR setup. Only recently did I discover that although reception was fine, transmission was not working at all.
Debugging the mcp251x Linux driver was an interesting exercise. Google wasn’t much help, as it seemed the few others attempting this same feat were either successful or completely unsuccessful. My situation suggested I had neither electrical problems nor compatibility issues. Through resolving my problem I learned a whole lot of stuff I didn’t expect to, and it was really fun and rather educational. Maybe by sharing my diagnostic process others can learn something too.
Since I had a working receiver I was working under the impression there was some sort of issue with the way I configured the mcp251x driver. I tried upgrading to the latest version of the kernel (2.6.34) but no joy. I also looked at the latest development version of the mcp251x driver from the SocketCAN project. No new insight there at all.
From building a simple driver for the AVR chip I had a good working knowledge of how the MCP2515 works, so I figured it was really something to do with how the driver was written. It obviously wouldn’t be a generalized problem, but I had no idea what could be wrong. I started by adding a whole bunch of printk() debugging statements to better trace how the driver was called from userspace. Something odd was happening: the driver was repeatedly polling the MCP2515 chip for status. I traced what it was doing in the code and also followed the SPI conversation using my Saleae Logic logic analyzer. Nifty device by the way, highly recommend it. Turns out it didn’t really help solve the problem, but it did help eliminate various false leads.
The SPI conversation and responses were perfectly normal. More printk() statements later and I was rather stumped. Then it occurred to me (at a time I wasn’t actually in front of my project) that there really should not be any polling. The driver configures itself with an interrupt line so it knows exactly when an event is pending. The polling I could observe over the SPI bus suggested this mechanism was not working (the status returned by the chip correctly indicated there were no pending events most of the time).
Turns out the default Overo build from OpenEmbedded sets up the GPIO pin for the interrupt line with a pull-down resistor. The interrupt line should normally be high then pulled low to signal a pending event. As far as I could tell, the pin was never actually being pulled high enough to be sensed by the Overo processor. I expect that the voltage level converters I’m using are to blame, failing to overcome the resistance of the pull-down resistor. This meant the Overo was seeing the GPIO pin as a low logic level all the time, leading to a constant interrupt. When this happens, the interrupt handler will fire and poll for the reason for the interrupt. Inspection of the code suggests this could starve the transmit code path from ever getting enough time to send messages. Indeed this would explain the situation I was observing. I adjusted the pin configuration in the u-boot recipe (now in my user.collection) to use a pull-up resistor instead, and now everything is fine. Because of the way that the mcp251x driver is written this should be ok.
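To make the failure mode concrete, here is a toy model of the situation described above. It is only a sketch and not the mcp251x driver's actual logic: the tick-based scheduling, the function names, and the assumption that the chip can always pull the line low against a pull-up are all made up for illustration. It just shows how an active-low interrupt line that can never rise above a pull-down looks like a permanently pending interrupt.

```python
PULL_DOWN = "down"
PULL_UP = "up"


def pin_level(pull, chip_asserts_irq, driver_can_override_pull):
    """Logic level (0 or 1) the processor sees on the interrupt pin.

    The interrupt line is active low: idle high, pulled low on a pending event.
    """
    if chip_asserts_irq:
        # Model assumption: the line can always be driven low,
        # whichever pull resistor is configured.
        return 0
    if pull == PULL_UP:
        # Idle state agrees with the pull-up, so the pin reads high.
        return 1
    # Idle high has to fight the pull-down; a weak level converter loses.
    return 1 if driver_can_override_pull else 0


def simulate(pull, ticks=1000, event_every=100, driver_can_override_pull=False):
    """Count ticks spent in the interrupt handler versus ticks left for transmit."""
    handler_runs = 0
    transmit_slots = 0
    for tick in range(ticks):
        # Hypothetical traffic: a real event every `event_every` ticks.
        chip_asserts_irq = (tick % event_every == 0)
        if pin_level(pull, chip_asserts_irq, driver_can_override_pull) == 0:
            handler_runs += 1      # handler fires and polls the chip for status
        else:
            transmit_slots += 1    # nothing pending, transmit path can run
    return handler_runs, transmit_slots


if __name__ == "__main__":
    for pull in (PULL_DOWN, PULL_UP):
        handler_runs, transmit_slots = simulate(pull)
        print(f"pull-{pull}: handler ran {handler_runs} times, "
              f"{transmit_slots} ticks left for transmit")
```

With the pull-down and a weak driver, every tick in this model goes to the handler and none to transmit. With the pull-up, the handler only runs on the ticks where an event really is pending.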
I also discovered that my previously published user.collection tarball did not match my actual configuration. Most likely because of the voltage level converters (again), I had lowered the maximum SPI rate to 500kHz. I’m using a CAN bus speed of 125 kbit/s with a very low message frequency, so this doesn’t seem to be a problem. I was also annoyed by several useless messages printed to the console by the SPI driver and the mcp251x driver, so I commented them out with my own patch. Both issues are now fixed in the tarball. Download, roll into your own user.collection directory, and enjoy.
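For the curious, a rough back-of-the-envelope check of why the lower SPI rate should be harmless, taking the bus speed quoted above as 125 kbit/s. The frame bit counts are for a standard 11-bit-ID data frame without stuff bits. The SPI cost is an assumption: one instruction byte plus five header bytes plus the data bytes for a single receive-buffer read. The real driver issues extra transfers (reading and clearing interrupt flags, for example), so treat the SPI figure as a lower bound rather than a measurement.

```python
CAN_BITRATE = 125_000   # bits per second
SPI_CLOCK = 500_000     # Hz, the lowered maximum SPI rate

# Standard data frame, no stuff bits:
# SOF 1 + ID 11 + RTR 1 + IDE 1 + r0 1 + DLC 4 + CRC 15 + CRC delimiter 1
# + ACK 1 + ACK delimiter 1 + EOF 7 = 44 bits, plus 3 bits of intermission.
FRAME_OVERHEAD_BITS = 47


def can_frame_time(data_bytes):
    """Seconds a frame with `data_bytes` of payload occupies the bus."""
    return (FRAME_OVERHEAD_BITS + 8 * data_bytes) / CAN_BITRATE


def spi_read_time(data_bytes):
    """Seconds to clock one receive buffer out over SPI.

    Assumption: 1 instruction byte + 5 header bytes + payload, nothing else.
    """
    return 8 * (1 + 5 + data_bytes) / SPI_CLOCK


if __name__ == "__main__":
    for n in (0, 8):
        frame_us = can_frame_time(n) * 1e6
        spi_us = spi_read_time(n) * 1e6
        print(f"{n} data bytes: frame takes {frame_us:.0f} us on the bus, "
              f"SPI read takes at least {spi_us:.0f} us")
```

Even for back-to-back frames, the SPI read in this estimate takes roughly a quarter of the time the frame spends on the wire, which fits with the lower SPI rate not being a problem at a very low message frequency.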