I've got some Python batch processes that run sequentially and peg a single core at 100%. I've got a 4-core machine that was begging me to use it. So I pulled out multiprocessing and wrote a little wrapper. I'm calling it coreit. Here's the code. The end result was that it ran 3 times faster (it only used 3 of the cores).
This probably exists elsewhere. It's pretty basic but gets the job done. Feedback is welcome.
But... this is already done in the stdlib:
import multiprocessing

p = multiprocessing.Pool(4)
res = p.map(slowfun, args)