This is a little confusing, since your setup appears to be rather complex.

I am guessing you think your problem resides in the SFP modules you insert into the switches. These modules seldom fail. I suspect it is a software issue with the negotiation between the switches.

When you get a failure, I would first log in to your switches and check the log for error messages of any kind. An actual failed module should produce a different error than you would see if the switch detected an issue with the light coming from the remote device. I would then disable and re-enable the port on each end. This will in effect reboot just that one port. If this resolves it, I would look at settings like speed, duplex, and flow control. Sometimes the AUTO option does not work correctly.

To rule out a hardware failure, you could of course swap out the module for a spare. Try this without rebooting the switch. These modules are hot swappable in most equipment.

It is unlikely these devices themselves would get hot. You could in theory put too much light into one, but that mostly just breaks it rather than overheating it. Just be sure you are using the correct modules for the distance you are going. You do not want to use the extended range ones on short fiber runs, since their stronger transmitters can put too much light into the receiver at the far end.
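
If you want to sanity check a module against your fiber run, here is a rough Python sketch of the arithmetic. It estimates the light level arriving at the far end and compares it to the receiver's limits. The function names are my own, and every number in the example is a made-up placeholder, not taken from any datasheet. Substitute the transmit power, receiver sensitivity, and receiver overload figures from the datasheets of your actual modules.

```python
def received_power_dbm(tx_power_dbm, fiber_km, loss_db_per_km, connector_loss_db=0.0):
    """Estimate optical power at the receiver (dBm).

    Simple link budget: start from transmit power and subtract
    fiber attenuation and total connector/splice loss.
    """
    return tx_power_dbm - fiber_km * loss_db_per_km - connector_loss_db


def check_link(tx_power_dbm, fiber_km, loss_db_per_km,
               rx_sensitivity_dbm, rx_overload_dbm, connector_loss_db=0.0):
    """Classify the estimated receive level against the receiver's limits."""
    rx = received_power_dbm(tx_power_dbm, fiber_km, loss_db_per_km, connector_loss_db)
    if rx > rx_overload_dbm:
        return "overload"   # too much light for this receiver
    if rx < rx_sensitivity_dbm:
        return "too weak"   # not enough light to hold the link
    return "ok"


# Hypothetical long-range module; all values are illustrative assumptions.
module = dict(
    tx_power_dbm=3.0,          # assumed transmit power
    loss_db_per_km=0.25,       # assumed fiber attenuation
    rx_sensitivity_dbm=-24.0,  # assumed weakest usable signal
    rx_overload_dbm=-8.0,      # assumed strongest safe signal
    connector_loss_db=1.0,     # assumed total connector loss
)

print(check_link(fiber_km=0.1, **module))    # short patch run -> overload
print(check_link(fiber_km=60.0, **module))   # long run -> ok
print(check_link(fiber_km=120.0, **module))  # very long run -> too weak
```

With these placeholder numbers the long-range module overloads the receiver on a 100 m run but is fine at 60 km, which is the situation described above. Treat this as an estimate only. If your switch reports the actual receive power of the module, that reading is more trustworthy than any calculation.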