PHP and curl_multi_select()

I’m documenting this in hope it helps someone one day.

When using curl_multi_select($handle, $timeout) to wait for the next active/completed cURL handle, the second parameter should be a float value, in seconds.  When it is omitted, PHP’s documentation says it defaults to 1.0 seconds.  However, if you get adventurous and think that you can set the value higher for sessions that you expect to take a long time, think again.  In my experience, values higher than 4.0 seconds have adverse effects.

I was writing an application that is expected to have long-running cURL connections.  I thought setting the timeout to 10 seconds would be acceptable.  After all, it’s a timeout, not a polling interval… right?  WRONG.  It seems that if the curl_multi_select() call has a timeout that is much longer than the request times, then you run the risk of the cURL handle being garbage collected before you even get at it with curl_multi_info_read().  This means that when you eventually get something, it will act as though every connection has timed out, but it will not have an error code because technically it didn’t time out.
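Here is a minimal sketch of the kind of loop in question, with the select timeout kept short and every finished transfer read with curl_multi_info_read() as soon as the wait returns.  The function name fetchAll() and its parameters are hypothetical, made up for this example; they are not part of cURL or RollingCurl, and the sketch only assumes the standard curl_multi_* functions.

```php
<?php
// Hypothetical helper: fetch several URLs in parallel using a short
// curl_multi_select() timeout. The names here are illustrative only.
function fetchAll(array $urls, float $selectTimeout = 1.0): array
{
    $results = [];
    $multi = curl_multi_init();

    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        // Tag the handle with its URL so it can be identified later.
        curl_setopt($ch, CURLOPT_PRIVATE, $url);
        curl_multi_add_handle($multi, $ch);
    }

    do {
        $status = curl_multi_exec($multi, $running);

        // Read every completed transfer right away, before waiting again.
        while ($info = curl_multi_info_read($multi)) {
            $ch = $info['handle'];
            $url = curl_getinfo($ch, CURLINFO_PRIVATE);
            $results[$url] = [
                'errno' => $info['result'], // CURLE_OK (0) on success
                'body'  => curl_multi_getcontent($ch),
            ];
            curl_multi_remove_handle($multi, $ch);
        }

        // Wait for activity, but never longer than $selectTimeout seconds.
        if ($running > 0 && curl_multi_select($multi, $selectTimeout) === -1) {
            usleep(1000); // select failed; back off briefly to avoid a busy loop
        }
    } while ($running > 0 && $status === CURLM_OK);

    curl_multi_close($multi);
    return $results;
}
```

Calling fetchAll(['https://example.com/'], 1.0) would return an array keyed by URL.  Whether a longer wait really causes the behaviour described above is my reading of the symptoms, not something this sketch proves.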

If you are a user of RollingCurl, a Google Code-hosted project that has been forked many times on GitHub, you’ll see a RollingCurl->timeout instance variable.  This is the timeout to use for curl_multi_select().  Be careful when setting it!  I’ve heavily edited my RollingCurl code to account for this and for several other problems, such as a bad memory leak when an error occurs in a callback, and to add a per-request callback feature.

There is a huge difference in multi-threaded applications between polling interval and timeout.  It seems that for this usage, the word “timeout” was a bad choice.  A timeout should be a maximum time to wait before returning, not a minimum time.

I’ve learned my lesson: my curl_multi_select() calls now use the default of 1.0 seconds and are running fine.  It’s sad that it took so long to find this problem.