A Question of Lifespan

Here we started a conversation about SD Lifespan. How can we make our SBC projects run for longer? In comments I’ve seen elsewhere, people seem to think it is OK that a Pi may well fail within a year due to SD – I don’t think that is even REMOTELY acceptable unless you’re making a novelty games machine.  Read more as we came up with some great solutions…

People make a big deal about the reliability of Linux – not a great deal of use if the entire file system will come to a halt in a year…

I’d never suffered this problem until recently – my first heating system issue has appeared after more than a year’s continuous use (and that includes doing lots of experiments on the same system). You may have seen comments in earlier blog entries about this – for the first time ever I recently suffered a dead SD on one of my Raspberry Pi projects – stone-cold dead – read-only, and NOTHING on the Pi or my PC could encourage the SD to write again.

But let’s digress – some of the solutions take time and effort – THIS one is simple.. https://goo.gl/meAAB8  – a DUAL SD adaptor….  if things fail – have someone flick the switch and reboot –  right – back to the plot…. (thanks Antonio)…

It has been said that some cheap SDs are not as large as they claim to be, and as soon as you write beyond their ACTUAL capacity – the chips become read-only. I’ve yet to test this out but TKaiser has suggested testing all new SDs and in a previous comment has recommended SanDisk Extreme Plus.

The test program H2TESTW is widely available for free. I’m testing my first 16GB disk now – looks like it will take 20 minutes but as no user interaction is needed… time well spent.

In here you will find questions and opinions. In the comments hopefully you will find some resolution – lots of bright people read this blog and I’m hoping they have solutions rather than opinions.

If you read on the web about the subject of eMMC and SD and USB memory – it is hard to tell what is hard science and what is opinion.

For example there are blogs suggesting that instead of relying on SD, use a USB memory stick. I have trouble with this as the technology is similar. Why should a USB stick last any longer than an SD?

You’ll see reference to eMMC – there can be no doubt that eMMC (usually an internal module or chip) is usually faster than SD – but does it LAST any longer – some say yes, some say no. To be sure it is less convenient to back up compared to an SD you can simply pull out and replicate!

Then there is the hard disk. I have a natural tendency to think that a spinning disk has to be less reliable than solid state memory but every experience I have says the opposite. I could not tell you the last time a hard disk went bad on me. Of course – they tend to be more expensive – and they are very much larger than SD.

The general idea is that you can READ SD as often as you want but there is a limit sometimes described as 10,000 write cycles, sometimes described as 10 times that amount. I suspect the latter and that there is just a lot of old information out there.

Then there is WEAR LEVELLING wherein some SDs have a chip inside that helps prevent a single location being written to, too many times – knowledge on this seems to be akin to witchcraft. WHICH manufacturers use this in WHICH SDs and HOW effective is it? I’ve not found a single source of information on the subject that is up to date and verified.

Today I read about putting some directories into RAM.

In the /etc/fstab file you can add for example

tmpfs /var/log tmpfs defaults,noatime,nosuid,mode=0755,size=100m 0 0

Works a treat but for one tiny item – Apache would not start up!
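A likely explanation, offered as an assumption rather than something confirmed here: a tmpfs starts out empty on every boot, so the sub-directory Apache logs into (on Debian-style systems usually /var/log/apache2) no longer exists and the daemon refuses to start. Below is a minimal sketch of a helper that recreates such directories. The function name make_log_dirs is made up for illustration, and where you call it from is left open; the only firm requirement is that it runs before Apache starts.

```shell
# Recreate the log sub-directories that daemons expect to find.
# Usage: make_log_dirs BASE_DIR subdir1 [subdir2 ...]
# (hypothetical helper, not part of any package)
make_log_dirs() (
    base=$1
    shift
    for d in "$@"; do
        # -p: no error if the directory is already there
        mkdir -p "$base/$d" || exit 1
    done
)
```

On the Pi the call would be something like `make_log_dirs /var/log apache2`, run as root. Ownership and permissions may also need setting to match what the package originally created.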

Several people have mentioned RAMLOG – but from what I can see that no longer works with Jessie (the problem of old material hanging around on the web). This looks modern – and is reasonably straightforward to install – takes just a couple of minutes: https://github.com/azlux/log2ram – I installed it – and it works a treat. The default action is to update the disk every hour – but moving the file “log2ram” from /etc/cron.hourly to /etc/cron.daily makes more sense to me.

So many questions – so many potentially wrong answers. See comments about the actual number of writes to SD – would you believe the limit for any given location (not the one you see but the REAL location) could be as low as 1000s rather than 10s of thousands? I had no IDEA it was that low.
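To put rough numbers on write-cycle figures like these, here is a back-of-envelope sketch. It assumes perfect wear levelling (every cell shares the load equally), which you cannot count on with SD cards, and it lumps all controller overhead into a single "write amplification" factor. The function name and the example figures are illustrative, not measurements.

```shell
# years = (capacity_GB * cycles_per_cell) / (GB_written_per_day * write_amplification * 365)
# Arguments: capacity in GB, cycles per cell, GB written per day, write amplification
estimate_card_years() {
    awk -v cap="$1" -v cycles="$2" -v daily="$3" -v wa="$4" \
        'BEGIN { printf "%.1f\n", (cap * cycles) / (daily * wa * 365) }'
}

# Example: 16 GB card, 1000 cycles per cell, 2 GB written per day, amplification of 4
estimate_card_years 16 1000 2 4    # prints 5.5
```

The same card rated at 10,000 cycles would come out ten times higher. Without wear levelling the picture is much worse, because a single busy location takes all the writes on its own.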

On the subject of power supplies, in the comments you’ll find code for testing the likes of the Raspberry Pi – as there are registers in the Pi which pick up voltage issues… I was horrified how easily a long USB lead would allow the Pi to work – but continually register issues.

In testing – I found comments from TKaiser useful – then when wondering about the CPU frequency I found THIS article – and the associated script useful..

http://megakemp.com/2013/02/26/adventures-in-overclocking-a-raspberry-pi/

So already we see a need to reduce writes, only use good, tested SDs, and use good supplies with short leads. Not new, not rocket science but I am seeing some good science behind the need for this and look forward to reading more of your educated comments.

Keep the comments coming!

A Little Test

In the process of this discussion, TKaiser supplied us with a little script to return some information about power from the likes of the Pi2 or Pi3. This was intended to be used as a command line tool – repeating until told otherwise. Well, I like REPORTS…

I took out the loop section so as to return a single line of information – and that can conveniently be run in an EXEC node in Node-Red

Node Red showing Pi variables

 

I changed the script to simplify output – if someone can tell me how to produce output without “’C” and “V” so we have just numbers coming out – would be nice… I called this tk2.sh (changing permissions – don’t forget) and ran that inside an EXEC node in Node-Red…

    #!/bin/bash
    # tk2.sh - print one line: temperature, clock (MHz), throttle bitfield, core voltage.
    # Needs vcgencmd (Raspberry Pi firmware tools) and perl.

    # Highest configured clock in MHz, minus a 15 MHz margin for the comparison below
    Maxfreq=$(( $(awk '{printf ("%0.0f",$1/1000); }' </sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq) - 15 ))
    # get_throttled returns hex; show it as a 19-character binary string
    Health=$(perl -e "printf \"%19b\n\", $(vcgencmd get_throttled | cut -f2 -d=)")
    # Strip the 'C and V suffixes so only numbers are printed
    Temp=$(vcgencmd measure_temp | cut -f2 -d= | tr -d C | tr -d \')
    RealClockspeed=$(vcgencmd measure_clock arm | awk -F"=" '{printf ("%0.0f",$2/1000000); }')
    SysFSClockspeed=$(awk '{printf ("%0.0f",$1/1000); }' </sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq)
    CoreVoltage=$(vcgencmd measure_volts | cut -f2 -d= | sed 's/000//' | tr -d V)
    # At (or near) full speed report the sysfs value, otherwise the measured clock
    if [ "${RealClockspeed}" -ge "${Maxfreq}" ]; then
        echo "${Temp}$(printf "%5s" "${SysFSClockspeed}") $(printf "%019d" ${Health}) ${CoreVoltage}"
    else
        echo "${Temp}$(printf "%5s" "${RealClockspeed}") $(printf "%019d" ${Health}) ${CoreVoltage}"
    fi

(If you see a question mark above – the word is sampling)

As you can see we have some space-delimited values! If you look at the bitfield, semi-permanent recordings of issues are on the left (most significant bits) while on-going issues are on the right. Extracting from TKaiser’s notes..

The bits on the right are:

0: under-voltage

1: arm frequency capped

2: currently throttled

And corresponding on the left:

16: under-voltage has occurred

17: arm frequency capped has occurred

18: throttling has occurred
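As a sketch of how those bits could be pulled apart in the shell: the function below takes the raw value as vcgencmd get_throttled reports it (a hex number such as 0x50000) and names every flag that is set. The function name decode_throttled is made up for this example, and only the six bits listed above are handled.

```shell
# Print the names of the flags set in a get_throttled value (hex or decimal).
decode_throttled() (
    value=$(( $1 ))
    out=""
    # Each line of the list below is "bit number:meaning"
    while IFS=: read -r bit label; do
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            out="${out}${out:+, }${label}"
        fi
    done <<'EOF'
0:under-voltage
1:arm frequency capped
2:currently throttled
16:under-voltage has occurred
17:arm frequency capped has occurred
18:throttling has occurred
EOF
    echo "${out:-no flags set}"
)

# On a Pi:  decode_throttled "$(vcgencmd get_throttled | cut -f2 -d=)"
```

For instance, the bitfield 1010000000000000000 is 0x50000 in hex, and `decode_throttled 0x50000` reports "under-voltage has occurred, throttling has occurred".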

It is easy enough to break this down..

Here is another version where I have split up the values


The first is the input – the second is the split version – the same except they are now in 4 different places

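// trim() drops the trailing newline; /\s+/ copes with the padding spaces in the script output
var reading = msg.payload.trim().split(/\s+/);
msg.payload = reading[0] + " " + reading[1] + " " + reading[2] + " " + reading[3];
return msg;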

And from there you can do what you like with the data of course – one idea might be to read every minute and turn that string bitfield into an integer,  totalling up errors in the lower bits (you could just read bits of the string to achieve the same thing) … after a period send off a report email…

No need to report over-heating as the governor should take care of that – however – min-max summary in the email might be nice while testing.
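One way such a summary could be produced, assuming the one-line readings (temperature, clock, bitfield, voltage, separated by spaces) have been collected in a file, one per line: the sketch below uses awk to report the temperature range and how many readings had any of the three on-going bits set, i.e. the rightmost three characters of the bitfield. The function name and the output format are arbitrary choices for illustration.

```shell
# Read "temp clock bitfield volts" lines on stdin and print a one-line summary.
summarise_readings() {
    awk '
        NF >= 4 {
            t = $1 + 0
            if (n == 0 || t < min) min = t
            if (n == 0 || t > max) max = t
            n++
            # the last three characters are bits 2..0 (on-going issues)
            if (substr($3, length($3) - 2) != "000") live++
        }
        END {
            if (n == 0) { print "no readings"; exit }
            printf "readings=%d temp_min=%.1f temp_max=%.1f with_live_issues=%d\n", n, min, max, live
        }'
}

# Usage (the file name is an assumption):  summarise_readings < /tmp/pi-readings.txt
```

The resulting line is short enough to drop straight into the body of a report email.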


142 thoughts on “A Question of Lifespan”

  1. Hi all,

    I love home automation (as do we all I’m sure) and really like Peter’s contributions to the node-red community. This article has been an interesting read but I didn’t find an answer to my question so I’ll put my quandary to the good folks here.

    I started using node-red to control MQTT based devices around my home (mostly lights) but also to control a hydroponics system I have set up. It’s based on a RasPi 2 and has grown to include dashboard buttons controlling 8 gpio’s, several slider controls, other buttons controlling overall system functions, 6 graphs and 6 dial gauges. The gauges and dials are driven by monitoring files on the SD card. There are several python scripts that run in the background gathering data every few minutes and one runs every few seconds.

    The problem is that lately the editor and the dashboard have become VERY slow, almost unusable, and I was wondering if I have hit an upper limit on node-red’s capability, h/w limitations or the following…

    As this article points out, SD writes will eventually cause a system failure so in order to reduce the number of writes to the SD card (10,000 writes goes by fast when you are data-logging every few seconds), I changed my python data gathering scripts to write to a network mounted SAN drive instead of the SD card.

    Does anyone have experience with a similar issue? Is the SAN mounted drive likely the bottleneck or should I upgrade my h/w or try a ram disk or even a USB drive solution? I also thought about offloading the gpio control to an ESP8266 and just have the RasPi do the MQTT and node-red parts but this seems to me of minimal value at best.

    Any suggestions would be appreciated.

    Bruster

    1. Ok I have written to the supplier. If you recall in another blog entry I wrote about similar units. A MAJOR stumbling block being that when the battery was completely flat, as would happen if the power failed for any reason, such units would NOT start up until the load (Pi) had been disconnected. Also – some of the battery units these are used in need a button pressing to start them up.

      So before using these we need to confirm that firstly they WILL deliver up to 2.5 amps while simultaneously being charged, that they will do this without button pressing and that they will recover from a flat battery situation.

      If they do this – you’d better get in quick as I’ll be buying out their stock 🙂

      The only issue here may be language – it is China and so my email may or may not be understood. Time will tell.

  2. Ref SD corruption, my vote goes with looking at what OpenWRT and LEDE do with regards to filesystems; they have long experience in this area.
    Having the base FS readonly, and an overlay over it for modifications, with an easy mechanism to restore to ‘factory’ if the overlay corrupts.
    But on something like an SD or USB, some knowledge of where the erase boundaries are would be important – no point in the read-only part sharing an erase sector with some dynamic content. Great thing about this is that it will almost always boot even with a corrupt overlay, as most of the meaningful stuff is in the read-only part.
    I prefer even better to have the overlay on a different media – e.g. boot from read-only SD, overlay on disposable (backup-able) USB stick.
    My personal experience is zero corruption on OpenWRT (on maybe 10 devices) (ok, corruption on one device – because the nand interrupt was shared with the wifi interrupt! – a very bad move by BT); but I’ve had to recover my OrangePi three times from filesystem corruption (not SD card failure.. just corruption) in 4 months.
    Oh, and having worked with SSDs and analysed the controllers therein, I know I’ve never owned an *SD card* or a *USB stick* with any form of wear leveling :(. so a journaling FS which walks the complete card before re-using parts is good (like NTFS). An FS which repeatedly writes the same location when overwriting the same file (like FAT) is bad… – pity my direct experience of our linux filesystems is lacking on this point :).

    1. and it is such a good link I would have incorporated it into the blog entry – but I’m not entirely sure it has much to do with our little SSDs however. We now know from discussions what “wear levelling” is but it still seems to be less than obvious getting information as to how this is implemented in the various SSDs we use.

  3. Yes Antonio – very bad board. Andreas is thinking of using a little micro to control one of these supplies… and thinking about it – not that big a deal.. What is needed as the battery voltage goes down, say, to 3v or so is to activate an output – which the Pi would pick up – and turn itself off. Now at this point you can’t rely on battery voltage as it would immediately rise due to the load disappearing. So I’m thinking the next step would be to wait, say 3 minutes, beeping while you wait and then turn off the power to the Pi. Given a suitable relay and bearing in mind you only have 3v, this might take some thought. At this point the voltage would shoot up so you could not do a simple voltage comparison – and you’d have to make sure your A/D convertor was running on the internal lower voltage reference… I’m thinking the thing to do next would be one of two things (after turning the alarm off). Wait to see what the voltage does – do nothing for 10 seconds after the power to the Pi is switched off.

    At this point I’d be taking 10 readings 1 minute apart – if all 10 show reading on reading increase – and the battery is above say 3.5 volts??? Turn the power back on to the Pi.

    Of course on power up of the little controller – the battery could be flat or full – if full you would not SEE any increase in voltage over time so perhaps in that case – on power up if the battery voltage is above 3.8v – just turn the Pi on.

    These are just thoughts and it would not take much in terms of time or money to make such a little adaptor – but that relay might take some thought!

    1. BTW: That’s why playing with tablet SoCs that come with PMIC (support) is always a nice approach when dealing with batteries. Since if set up correctly everything is already in place to deal with such situations (since SoC/PMIC combo made for exactly that).

      I’m still hoping that those Chinese board makers throwing cheap H3 boards at us (H3 is a TV box SoC lacking direct PMIC support) will pick up similar cheap R SoCs (which are just renamed Allwinner A series SoCs, R8 used on CHIP is an A13 in reality, R16 is an A33 and R18 an A64, R58 an A83T and now it seems R40 has been renamed finally to A40). Those SoCs contain an own OpenRISC ultra low power core to deal with deep sleep states and together with PMIC + battery all that’s needed is just software + external power source.

    1. Kinda off-topic in this thread but anyway: such solar powered setups suffer from two potential problems: you need power banks that allow to be charged and provide power to connected devices at the same time (not all do, and especially not the cheap ones!) and then I doubt you can see anything on such a display in bright light (I even bought a kinda oversized Kindle to be able to work in sunny environments 😉 )

  4. Many thanks to everyone for the really useful comments here, especially tkaiser for the detective work. I’ve had time to run tests on the RasPis around here, and as expected they all have power problems. Interestingly the two with best results have the current all-white “official” RasPi PSU. I’m going to experiment on one, with a shorter thicker cable to the multiple +5 and ground pins on the GPIO header. I wonder if there could be a market for a “high performance Pi” PSU with GPIO connector?

    1. And now imagine how many RPi 2/3 users are in the same boat but just don’t know 😉

      BTW: I would not power through GPIO pins on Raspberries without taking a lot of care since you’re bypassing some protection (this is different on a lot of other SBC where there’s no difference).

      Better buy short 20AWG rated USB cables (if you buy 3 or 5 they’re really cheap on eg. Amazon) or a dedicated good PSU (5.15-5.2V with a short cable) or those guys combined with a good normal 5V PSU: https://aliexpress.com/item/DC-5-5×2-1-mm-Jack-To-Micro-USB-B-5P-Right-Angle-Male-Power-Adapter/32304571289.html (please be careful, 5.5/2.1mm is also used by a lot of 12V PSUs!)

      1. I believe I am with you on this TKaiser – stick with the micro-USB connector – but short, thick leads and decent supply. That 90 degree adaptor is useful as it is easy to make your own leads that way – incidentally the postage is significant on that one so if anyone sees this item at a better price feel free to put a link in here – along with any links to PROVEN decent leads.

          1. Right – good stuff Antonio – I looked at the English-speaking AliExpress and some offers looked great until you took postage into account – Spain being worse than the UK. Then I came across this link – which I’ve shortened for brevity..

            http://bit.ly/2oIguAy

            That works out at under £1 each – the short length of cable will have negligible effect and it means you can use a decent supply with thick 2-core cable – well spotted.

            This thread is turning out to be one of my better ideas – lots of good stuff in here.

            I’ve ordered 5 to get me started. Must make some simple Node-Red to email me if there are any power supply issues.

          2. I ordered some cheap ‘USB type A to 5.5/2.1 mm cables’ recently on Aliexpress just to throw them in the bin in the meantime.

            Idea behind: I do a lot of precise consumption monitoring stuff (mis)using a Banana Pro as ‘measuring PSU’ powering other boards through one of its USB ports. Banana Pro like most other Allwinner A20 boards features a power management IC that is able to measure current/voltage and with some patience (30 min average values) you get pretty precise consumption numbers of connected boards.

            Only cables missing were those for ‘5.5/2.1mm barrel jack’ only boards. But those cables from Aliexpress were a joke. 50cm long, pretty thin and obviously resistance (impedance?) way too high. Only boards that boot with these cables are OPi Zero and NanoPi NEO/Air with Armbian’s IoT settings (limiting clockspeeds to the min to lower consumption). With normal settings even those boards freeze/reboot at startup.

            TL;DR: Don’t trust cheap cables even if they’re short 😉

            1. the ones we bought are about 10cm, and cable seems thick… sure, cables inside could be thin… let’s try, but i already ordered microusb and others connectors to do my own cables… btw, the Aukey short ones i’m using seem good…

            2. While I understand the sentiment, given the lack of precise information at point of sale it isn’t easy to tell if a cable is “expensive” or someone is just making more profit. Hence general request for “proven” links.

              1. i know, but to be proven, we need:
                1) someone to actually BUY those items
                2) same one to wait, even 2 months…
                3) always the same one to test it…

                by myself, i’m at point 2… tic… tac… tic… tac… tic… tac… tic… tac…

                1. Hah, well I have ordered the little 90degree adaptors – couple of weeks.

                  Also on that Pi Power supply I reviewed a while ago – they gave me a refund as it was rubbish, resetting when you disconnected power – well, Andreas may be right – I put a 330u cap on it and it is ALMOST ok (doesn’t crash but the red LED on the Pi flicks) , I’m going to copy him and put a 3,300u cap on the output. Only issue right now is they are cheap – but the thieving suppliers in the UK want far more in postage than the cost of the caps – so – Hong Kong it is.

                  1. Ok, scratch that – Andreas has confirmed that the unit if allowed to get quite flat does not start up properly before being “primed” by an external supply. I hope the Chinese developers are reading this… WHAT WERE YOU SMOKING WHEN YOU DESIGNED THIS!!!???

  5. side note: in the search of the perfect server to host my setup, recently my cousin dropped an Asus T100, one of those hybrid mini notebook with detachable keyboard and 10″ lcd touchscreen… she was going to throw it in the trash bin, while i asked to give it to me… well, as you can see, shattered touchscreen, but screen is fine… keyboard not working, but as all the electronics is in the lcd half, i removed it…

    so now i’ve a baytrail atom 4 core 1.46ghz, with 2gb ram, a decent 32gb emmc, an hdready lcd, and most important of all, an 8000mah battery, which with windows last about 10 hours, so i think with screen off and only using it remotely via ssh and all THE SCRIPT goodies, it should last i think a full day…

    using a microusb hub i connected both a keyboard/mouse combo and an usb dongle with which i installed ubuntu xenial (little problems to deal with EFI Bios, nothing unresolvable with a little of time), script completed in 25 minutes, included tons of updates as the iso was an year old (now we are at 16.04.2, mine was .0, now is .2 after online updates)… it’s fast, it lasts long, it has x64 ubuntu on intel version, i can manage everything, have to try the embedded camera, too…

    well, for something that was going to be discarded… i plan to attach its PSU to a sonoff so my setup can turn it on when battery under 10%, without having to leave it always connected… i’m very happy with it! 🙂

  6. Tested my new SanDisk 16GB Ultra Micro SD Card (SDHC) with h2testw_1.4. Passed 🙂 Bought this from MyMemory in UK but sent from Switzerland. (I don’t like that the new Pi 3 doesn’t have a spring loaded SD slot – makes it more difficult to remove when the Pi is in a case).

    Yet to test the PSU (thanks tkaiser for the process) as literally spent all day trying to get a bluetooth speaker working from the CLI. Just sussed it 🙂 Jessie has BT built-in but obviously not complete as it only worked via the GUI.

  7. Yes, this thread does get very technical but if at the end of the day it can come up with some simple changes newbies can implement to minimise risk, then it will be very worthwhile 🙂

    I’ve just received my 2nd Pi (a model 3 this time) and another SanDisk 16GB Ultra Micro SD Card (SDHC). What I also ordered was the Official Raspberry Pi 3 Universal Power Supply. At £7 why buy anything cheaper.

    In this thread there are many comments about using quality PSU. Any opinion on these RPi labelled PSUs?

    PS. my 2 year old official PSU was by Stontronics (model DSA-13PFC-05 FCA). The latest is by RS Components (model DSA-13PFC-08 FCA). Both cables marked 18AWG. One was bought from Pi Hut and the other Pimoroni , both seem to be leading Pi retailers, so hopefully no fakes.

    1. The official RPi 3 PSU is a good choice, but why trust it blindly? Simply connect as many USB peripherals as possible, run ‘stress -c 4 -t 600’ and then check the output from

      perl -e "printf \"%19b\n\", $(vcgencmd get_throttled | cut -f2 -d=)"

      If there’s a ‘1’ on the left you suffer from powering problems and when the next digit is also ‘1’ then overheating also occurs. It would be so easy for Raspbian guys to call the above on shutdown and redirect output to eg. /var/log/raspberry-health.log, check this as part of a boot script and warn users if either/or happens so they can act on accordingly (improve power supply, add a heatsink).
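A rough sketch of that idea, with the caveat that nothing here is part of Raspbian: record_health and check_health are hypothetical helpers, the log path is whatever you pass in, and the caller decides when to run them (for example from a shutdown script and a boot script). The raw hex value from vcgencmd get_throttled is stored as-is; the checks use bit 16 (under-voltage has occurred) and bit 18 (throttling has occurred) as listed in the post above.

```shell
# Append a timestamp and the raw throttle value (e.g. 0x50000) to a log file.
record_health() {
    printf '%s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$1" >> "$2"
}

# Look at the most recent entry in the log and print a warning per problem found.
check_health() (
    last=$(tail -n 1 "$1" 2>/dev/null | awk '{ print $2 }')
    [ -n "$last" ] || exit 0          # no log yet: nothing to report
    value=$(( $last ))
    if [ $(( value & 0x10000 )) -ne 0 ]; then
        echo "WARNING: under-voltage has occurred - check the power supply and cable"
    fi
    if [ $(( value & 0x40000 )) -ne 0 ]; then
        echo "WARNING: throttling has occurred - consider a heatsink or better airflow"
    fi
)

# At shutdown:  record_health "$(vcgencmd get_throttled | cut -f2 -d=)" /var/log/raspberry-health.log
# At boot:      check_health /var/log/raspberry-health.log
```

Note that a log kept in RAM would not survive the reboot, so this file has to live somewhere persistent.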

  8. Thanks to all for feedback on this subject – this is great and it really much in line with what I’d hoped for this blog when I started it as a simple reminder to myself as to how I’d made things work (I have a memory like a sieve).

    Here we have some great collected information about using SDs from which I’m sure others will benefit. I have already implemented the RAM disk for logs – working a treat though I think I need to check it saves them on power-down as well as on a timed basis – and I now know of some dead routes NOT to go down. I’m sure the discussion will continue – and once I’m settled down in Spain next week hopefully will start to make use of this. I’ll be taking a little time out to install and test my new 250w solar panel and the electronics that go with it – I’ll cover that on the blog.

    For now however – thanks to everyone for their continued input of knowledge and enthusiasm.

    4 day trip coming up much of which is broadband-free.

  9. The script on megakemp.com is unfortunately neither suited for RPi 2 nor 3 since it does NOT report the real clockspeed of the CPU cores.

    For whatever reasons the RPi kernel is lying wrt /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq in throttling situations. Your RPi 3 without a heatsink in a tiny enclosure might throttle down to 601 MHz and this script (or almost EVERY OTHER script available on the net) will still report 1200MHz. Only if throttling reaches 600 MHz then the real clockspeed will be reported (same with RPi 2 but there’s 600 vs. 900 MHz)

    I sent them a comment on github and hope they update the script and use ‘vcgencmd measure_clock arm’ instead to report the real clockspeed and not random numbers.

  10. Notes as I go along… with proper supply and decent leads I see continuous zeroes on TKaiser’s test… it may or may not be relevant but the Pi which had an issue with SD – has power supply under-voltage issues. Meanwhile H2testw just passed my new Amazon Sandisk ultra 16GB with flying colours.

    1. Good to hear! I always make another test after getting new SD cards and that’s running the iozone call from here (the ‘standard performance test’ as I would call it): https://forum.armbian.com/index.php?/topic/954-sd-card-performance/

      Sadly Raspbian doesn’t ship with iozone installed and there’s also no iozone3 package available (this is really sad since people then start to use hdparm/dd which produce just numbers without meaning and do not even test the most important numbers: random read and especially write performance).

      But there’s always a simple script around the corner, just use Jeff’s curl call from here: http://www.pidramble.com/wiki/benchmarks/microsd-cards

  11. Love this whole topic. I also believe that a system should stay up and running for years (decades) if it doesn’t need to be rebooted (firmware upgrades/patching or power shutdown).

    I ran into the USB cable issue when I found I could charge my phone or tablet in about 45 minutes (from dead to full) with the cable provided with the 10AHr battery. I’ve had original Pi 2s that blew up the SD (didn’t know about the firmware bug, good to know) and want more than anything to limit what is written to the SD (rsyslog *.* @mozart.uucp:514). I just read the above and want to thank everyone for sharing. I now have to reread and take notes. 🙂

    Pete, let me see if I can figure out the NFS root for the Pi 2. I’ve got a spare with a broken SD card holder. I can use that one and tape the card in place. I won’t be able to test that right away though.

  12. GUYS – thank you all for your contributions here but can I ask a favour. The reason I put this blog entry up was to make accessible in a single place a discussion that helped us move forward – and especially those who don’t understand the issues – sadly some of these responses are very “Geek Only” making large assumptions about knowledge and using many abbreviations.

    Can we keep it simple where possible ( I know that some of the problems are NOT simple) – there are a couple of paragraphs in the above responses which may as well be in Chinese for a lot of readers.

    Thanks again for all the feedback.

    1. Pete, (with respect as it is your site), why not let the “Chinese” flow?
      I for one, always feel privileged to read “Deep Geek sharing” on your blog and appreciate all who take the time to post.
      If it’s something I haven’t seen before (Great!), I really like the links so I can begin to translate the “Chinese” for myself 🙂

      How about, when things settle down, just summarising the main findings, e.g.
      – Use a better power supply (e.g. link)
      – Use larger SD cards
      – Check for bit settings (which record power failures)
      – Consider use of external storage with Storage (EXT?) setup on this device (USB memory?)
      – Change the script to output logs etc to external memory
      – Other…

      In the end, the objective is the same thing really – to stop a Pi crashing with memory problems?

      1. Ok you’ve talked me into it – don’t anyone complain I didn’t try to reduce the jargon – as for that BIT – I’m not convinced, awaiting a response from TKAISER. With the mouse left connected I get that “1” despite VERY little power voltage difference – I’m wiring up my 3 amps 5v linear supply right now to try bypassing the usb connector to test that further. More here as I go on.

        1. Where did you get this power meter with Micro USB in and Micro USB out? If you use a usual ‘USB Charger Doctor’ or stuff like that you’re measuring at the wrong end of the cable. The voltage drop is the result of cable and contact resistance so you would have to measure on testpoints TP1/TP2 on the RPi itself (most likely GPIO pins 2 and 6 will also do while you generate some load with ‘stress -c 4’ or something like that)

          1. Hi Tkaiser….. I’m measuring the voltage on connectors 6 (ground) and 4 (5v) on the Pi itself, I’m using a thick cable from a professional linear PSU. The power is also going to pins 6 and 4 for the purposes of this test.

            1. Sorry – to complete – and using a decent digital multimeter – not one of those cheap and nasty “USB Charger Doctors” – which I must say ARE good for weeding out really rubbish cable.

              For the purpose however of validating this continuous readout of frequency and temperature of yours – which I have running in a separate window – I am not seeing an indisputable tie up between frequency and voltage OR frequency and temperature, despite testing voltages AT THE PI2 of 4.9 at one extreme and 5.3 at the other.

              So help me out here – what can be accounting for the fact that there seems to be no clear correlation?

              1. Now tested on and adapted for RPi 2 too: https://pastebin.com/t5vTfSPa

                To get results that seem less weird (the weirdness is caused by the so-called ‘cpufreq governor’), switching to ‘performance’ might make some sense: echo performance >/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

                And then it only gets interesting comparing the way you power the Pi now and back through Micro USB.

      2. It’s not ‘better power supply’ since the problem is the cable between power supply and the board. You can use a 3A rated PSU and will get in trouble for sure if the USB cable used has too high a resistance (and that applies to 99.99% of these cables since they were designed for 500mA max!) But you can use a 1A rated PSU with a good cable and be fine.

        And that’s the only reason those ‘good’ power supplies all provide not a type A receptacle but a fixed cable, and provide at least 5.1V: the real problem, of which 99.99% of Pi users are also unaware, is called voltage drop (while everyone talks about the power brick and amperage instead). Once you have fixed undervolting and learned how to avoid crappy SD cards almost everything is already fine 🙂
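The arithmetic behind that is just Ohm's law: the voltage lost in the lead is the current drawn multiplied by the total resistance of cable and connectors. A small sketch follows. The function name and the example resistance figures are assumptions chosen for illustration, not measurements of any particular cable.

```shell
# voltage_at_board SUPPLY_VOLTS CURRENT_AMPS CABLE_OHMS
# Voltage left at the board = supply voltage - (current * cable resistance)
voltage_at_board() {
    awk -v v="$1" -v i="$2" -v r="$3" 'BEGIN { printf "%.2f\n", v - i * r }'
}

voltage_at_board 5.0 2.0 0.25   # assumed thin lead on a 5.0V supply: prints 4.50
voltage_at_board 5.1 2.0 0.05   # assumed short, thick lead on a 5.1V supply: prints 5.00
```

With these made-up figures the same 2A load loses half a volt in the thin lead and only a tenth of a volt in the thick one, which is why the rating on the power brick alone says so little.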

        1. Yes appreciate comment about power supply – what I’m using is a lab supply with thick speaker cable and measuring the voltage at the Pi2.

          1. Nice to have a test script. I tried this on a Pi2

            The first half were predominantly taken with the voltage at 4.95v – the second half at 5.2v – as you can see, nowhere near conclusive – so to recap, this is a power supply feeding pins 4 and 6 on a Pi2 (with nothing else connected other than hardwired ethernet).

            The voltage generally goes up but the frequency.. well, see for yourself.

            So to see if there was noise involved I tried putting an electrolytic across the pins and even a supercap… no significant difference.

            39.5’C 600 MHz 1010000000000000000 1.2V
            40.1’C 600 MHz 1010000000000000000 1.2V
            39.5’C 600 MHz 1010000000000000000 1.2V
            40.1’C 900 MHz 1010000000000000000 1.3125V
            40.1’C 600 MHz 1010000000000000000 1.2V
            39.5’C 600 MHz 1010000000000000000 1.2V
            39.5’C 900 MHz 1010000000000000000 1.3125V
            40.1’C900/600 MHz 1010000000000000000 1.3125V
            39.5’C 600 MHz 1010000000000000000 1.2V
            39.5’C900/600 MHz 1010000000000000000 1.3125V
            39.5’C 600 MHz 1010000000000000000 1.2V
            40.6’C 900 MHz 1010000000000000000 1.3125V
            40.1’C 900 MHz 1010000000000000000 1.3125V
            40.1’C 600 MHz 1010000000000000000 1.2V
            40.1’C 600 MHz 1010000000000000000 1.2V
            40.1’C 600 MHz 1010000000000000000 1.2V
            40.1’C 900 MHz 1010000000000000000 1.3125V
            39.5’C 600 MHz 1010000000000000000 1.2V
            40.1’C900/600 MHz 1010000000000000000 1.3125V
            40.1’C 600 MHz 1010000000000000000 1.2V
            40.1’C900/600 MHz 1010000000000000000 1.3125V
            39.5’C 600 MHz 1010000000000000000 1.2V
            40.1’C900/600 MHz 1010000000000000000 1.3125V
            40.1’C 900 MHz 1010000000000000000 1.3125V
            40.1’C 600 MHz 1010000000000000000 1.3125V
            40.1’C900/600 MHz 1010000000000000000 1.3125V
            40.1’C 600 MHz 1010000000000000000 1.2V
            40.1’C900/600 MHz 1010000000000000000 1.3125V

            So to iron out power supply issues I scrapped all that and tried a RAVPOWER battery – this is a (genuine) 26AH battery pack able to put out 2.1 amp usb – into a lead I bought for charging my ridiculously high power phone – and it does a better job than other methods.

            Here are the readings…. RPI2 using new Raspbian – nothing plugged in except Ethernet. Voltage read at pins 4 and 6 – 5.03v

            To stop simply press [ctrl]-[c]

            41.2’C 900 MHz 0000000000000000000 1.3125V
            40.6’C 600 MHz 0000000000000000000 1.2V
            40.6’C 600 MHz 0000000000000000000 1.3125V
            40.6’C 600 MHz 0000000000000000000 1.2V
            41.2’C 600 MHz 0000000000000000000 1.3125V
            40.6’C 600 MHz 0000000000000000000 1.2V
            40.6’C 600 MHz 0000000000000000000 1.2V
            41.2’C 900 MHz 0000000000000000000 1.3125V
            40.1’C 600 MHz 0000000000000000000 1.2V
            40.1’C900/600 MHz 0000000000000000000 1.3125V
            40.6’C 600 MHz 0000000000000000000 1.2V
            40.1’C 600 MHz 0000000000000000000 1.3125V
            40.6’C 600 MHz 0000000000000000000 1.2V
            40.6’C 600 MHz 0000000000000000000 1.2V
            42.2’C 900 MHz 0000000000000000000 1.3125V
            40.6’C 600 MHz 0000000000000000000 1.2V
            41.2’C 600 MHz 0000000000000000000 1.2V
            41.2’C 900 MHz 0000000000000000000 1.3125V

            So – I know you tried this on a Pi3 (I don’t have one handy – it is running the house) but on a Pi2 I am not seeing any great correlation here – the only difference being that on the battery you’ll notice all zeros….

            Those zeros were consistent over several power cycles..

            I then took the SAME short lead to the Original power supply

            44.4’C 900 MHz 0000000000000000000 1.3125V
            44.4’C 900 MHz 0000000000000000000 1.3125V
            44.4’C 900 MHz 0000000000000000000 1.3125V
            43.3’C 900 MHz 0000000000000000000 1.3125V
            42.8’C 600 MHz 0000000000000000000 1.2V
            42.8’C 600 MHz 0000000000000000000 1.2V
            42.8’C 900 MHz 0000000000000000000 1.3125V
            42.2’C 600 MHz 0000000000000000000 1.2V
            42.2’C900/600 MHz 0000000000000000000 1.3125V

            back to the ORIGINAL supply with the longer (2ft) lead

            44.4’C 900 MHz 1010000000000000000 1.3125V
            44.4’C 900 MHz 1010000000000000000 1.3125V
            44.4’C 900 MHz 1010000000000000000 1.3125V
            44.4’C 900 MHz 1010000000000000000 1.3125V
            44.9’C 900 MHz 1010000000000000000 1.3125V
            43.3’C 600 MHz 1010000000000000000 1.2V
            42.8’C900/600 MHz 1010000000000000000 1.3125V
            42.2’C 600 MHz 1010000000000000000 1.2V
            42.8’C900/600 MHz 1010000000000000000 1.3125V
            42.8’C 600 MHz 1010000000000000000 1.2V
            42.2’C900/600 MHz 1010000000000000000 1.3125V

            So – conclusions so far…

            Short lead is important to the binary errors shown but makes NO apparent difference to frequency or internal voltage.

            Smoothing cap at Pi end makes no apparent difference.

            Actual voltage going in makes little difference from 4.9 to 5.2v
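Rather than eyeballing the readings, they could be tallied. The following is only a sketch: the regular expression is an assumption based on the lines as pasted in this comment (including the fused `900/600` entries), and the sample lines are copied from the readings above.

```python
import re
from collections import Counter

# Assumed line shape: "<temp>'C <freq> MHz <flag bits> <core volts>V".
# Accepts both the typographic and the plain apostrophe.
LINE_RE = re.compile(r"([\d.]+)[’']C\s*([\d/]+) MHz\s+([01]+)\s+([\d.]+)V")


def tally(lines):
    """Count how often each (flag bits, frequency) pair shows up."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            _temp, freq, flags, _volts = match.groups()
            counts[(flags, freq)] += 1
    return counts


sample = [
    "39.5’C 600 MHz 1010000000000000000 1.2V",
    "40.1’C 900 MHz 1010000000000000000 1.3125V",
    "40.1’C900/600 MHz 1010000000000000000 1.3125V",
    "40.6’C 600 MHz 0000000000000000000 1.2V",
]
for (flags, freq), count in sorted(tally(sample).items()):
    print(flags, freq, "MHz:", count)
```

Feeding each full run through `tally` would show at a glance whether the frequency mix differs between supplies or only the flag bits do.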

            1. The problem is that your test setup also has both cpufreq scaling and ‘dynamic voltage frequency scaling’ active. Which can be confusing to say the least. If you do a ‘cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor’ it will most probably be ‘ondemand’. And this leads to a kernel driver switching between 600MHz/1.2V (idle) and 900MHz/1.3125V (some or heavy load).

              As soon as you switch to performance it starts to get interesting since then RPi should remain at 900MHz/1.3125V all the time. With the crappy cable it will now start to drop down to 600MHz/1.2V even under full load while with your good bench PSU it will stay at 900MHz/1.3125V. The monitoring script’s output is only interesting when there’s some load on the RPi (eg. ‘stress -c 4’) or some power hungry USB peripherals.

              And when powered through USB you’ll see ‘1010000000000000000’ switching to ‘1010000000000000101’ and should also be able to measure the voltage drop on the GPIO pins while you’ll also see that max internal voltage will be limited to 1.2V and cpufreq to 600MHz even if there’s a task like stress hammering the CPU cores.

              Damn: I’ve no idea how to conveniently switch cpufreq governors with latest Raspbian (am using an OpenMediaVault build right now where it’s ‘cpufreq-set -g performance’)
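One possible way to check and switch the governor without extra tools is to go through the same sysfs file named above. This is a sketch, not a tested Raspbian recipe: it assumes the kernel exposes that path, that writing to cpu0 is enough (it may not be on boards with per-core policies), and that the script runs as root when writing. The helper names are made up for the example.

```python
from pathlib import Path

# Path taken from the commands quoted in this thread.
GOVERNOR_PATH = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor")


def read_governor(path=GOVERNOR_PATH):
    """Return the active cpufreq governor, e.g. 'ondemand'."""
    return Path(path).read_text().strip()


def set_governor(name, path=GOVERNOR_PATH):
    """Write a new governor name; needs root on a real system."""
    Path(path).write_text(name + "\n")


# Only report on machines that actually expose the cpufreq interface.
if GOVERNOR_PATH.exists():
    print("current governor:", read_governor())
else:
    print("no cpufreq sysfs interface found at", GOVERNOR_PATH)
```

Calling `set_governor("performance")` as root would then be the equivalent of the `echo performance > ...` line, and the setting is lost again at the next reboot.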

              1. Thanks for this TKaiser – it is all slowly starting to dawn on me – and hopefully through this dialog on others.

                So I was on the wrong track looking for speed changes related to power. That is set by load – I think the next job is to find the easiest way to set the pi to run at full power all the time unless it is overheating – which is what I THOUGHT it was – but it isn’t.

                The relationship between average voltage and errors is not at all clear but the relationship between lead IMPEDANCE (resistance is good enough) and errors is very clear.

                My layman’s version of this suggests that while all my solutions generate enough voltage, the nature of Pi requirements appears to be that it has high instantaneous peak demands that are not met with thinner leads and/or inadequate power supplies…. and now we have a simple test.

                What we still don’t seem to have is the ability to RESET those error test bits – right now one glitch would trigger them – and so we’ve no idea HOW bad things are. Fixing that so that, say, Node-Red could be continually testing power quality – would indeed be a nice step forward.

                1. It’s all there. We have an obscure reading of 19 bits: the 3 on the right side describe the actual status, while the 3 on the left are set to 1 once the individual trigger has fired:

                  The bits on the right are:
                  0: under-voltage
                  1: arm frequency capped
                  2: currently throttled

                  And corresponding on the left:
                  16: under-voltage has occurred
                  17: arm frequency capped has occurred
                  18: throttling has occurred

                  So if you see ‘1010000000000000000’ then 0 and 2 were set to 1 in the past (but are fine right now) so 16 and 18 remain at 1 until next reboot. After my test an hour ago though I would say the explanation for 1/2 and 17/18 should be exchanged since bits 1 and 17 were only set to 1 after I put a pillow on poor RPi 3 to heat it up above 80°C.

                  And if thermal throttling occurs the numbers get really weird since then maximum cpufreq will still be reported (900 MHz on RPi 2 and 1200 MHz on RPi 3) while in reality the clockspeed is already much lower. That’s close to fooling users 🙁 But fortunately that’s an RPi 3 only problem since it’s really hard to get the other Raspberries to overheat.
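The bit list above can be turned into a small decoder. This Python sketch takes the bit meanings exactly as listed in this comment (bearing in mind the doubt raised about whether 1/2 and 17/18 should be swapped) and assumes the input is the binary string as printed by the monitoring script. `decode_status` is a name made up for the example.

```python
# Bit meanings as listed in the comment above.
FLAGS = {
    0: "under-voltage",
    1: "arm frequency capped",
    2: "currently throttled",
    16: "under-voltage has occurred",
    17: "arm frequency capped has occurred",
    18: "throttling has occurred",
}


def decode_status(bits):
    """Translate the printed bit string into the list of flags that are set."""
    value = int(bits, 2)
    return [name for bit, name in sorted(FLAGS.items()) if value & (1 << bit)]


print(decode_status("1010000000000000000"))
print(decode_status("1010000000000000101"))
```

The first string decodes to the two "has occurred" flags only, while the second adds the live under-voltage and throttled flags, which matches the description of what happens when powering through USB under load.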

                2. Impedance as an electrical engineering term includes inductance/resistance/capacitance.

                  Power leads of USB cables have an inductance of about 0.5 … 1 uH (microhenry) per meter. Whether the cable is thick or thin doesn’t matter, as long as the ratio of copper to insulation diameter is constant (thicker wires tend to have thicker insulation). Short cables have a low inductance; thick cables are about the same as thin ones.

                  High inductance hinders the power supply from reacting quickly to changes in current draw (well, the power supply does react, but the increased current reaches the load too late).

                  If there are long wires, bypass capacitors are important. The increased current (and *only* the difference to the former current) has to come from the capacitors, which will slowly discharge, and recharge again when the increased current reaches the load. If the capacitors are too small, their voltage will have dropped too much.
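A back-of-the-envelope Python sketch of both effects, using the standard relations V = L * dI/dt for the lead and dV = I * t / C for the bypass capacitor. Only the 1 uH per meter figure comes from the comment above; the 0.5 A current step, the timings and the capacitor values are hypothetical numbers picked for illustration.

```python
def inductive_drop(inductance_h, current_step_a, rise_time_s):
    """Voltage across the lead inductance while the current ramps up."""
    return inductance_h * current_step_a / rise_time_s


def capacitor_droop(current_step_a, duration_s, capacitance_f):
    """Voltage lost by a bypass capacitor that supplies the extra current."""
    return current_step_a * duration_s / capacitance_f


# 1 m of cable at 1 uH, load jumps by 0.5 A within 1 microsecond (assumed).
print(f"lead: {inductive_drop(1e-6, 0.5, 1e-6):.2f} V")

# The capacitor covers the extra 0.5 A for 10 microseconds (assumed).
for capacitance_f in (10e-6, 100e-6):
    droop = capacitor_droop(0.5, 10e-6, capacitance_f)
    print(f"{capacitance_f * 1e6:.0f} uF capacitor: {droop:.2f} V droop")
```

Under these assumptions a small capacitor loses ten times as much voltage as a large one over the same gap, which is the "too small, dropped too much" case described above.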

                  There is also a difference between the first RPi1 and almost every other board – the RPi1 used an LDO, while modern boards use synchronous buck converters. The RPi1 LDO has a dropout of about 1.0 … 1.2 V at 500 … 800 mA. As soon as the input voltage drops below 4.5 V, the output voltage drops as well. This is especially important for the SD, which typically needs 3V3.

                  The RPi1 A+/B+ and all later boards use a synchronous buck converter, which can deliver 3V3 even at an input voltage of 3.8V (although the USB ports will be out of spec). The Pine64 is able to run at 3.7V and less (it is designed to be powered from a single LiPo/LiIon cell), and when you select the “BAT”tery mode, even the 5V rails are powered correctly from the step up converter on the board.
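The LDO behaviour described for the first RPi1 can be sketched as a very simple model: the regulator holds its target until the input minus the dropout falls below it. The 1.2 V dropout and 3.3 V target come from the figures quoted above; the function itself is an idealised assumption and ignores load and temperature dependence.

```python
def ldo_output(input_volts, dropout_volts=1.2, target_volts=3.3):
    """Idealised LDO: output is the target, capped at input minus dropout."""
    return max(0.0, min(target_volts, input_volts - dropout_volts))


for input_volts in (5.0, 4.5, 4.2):
    print(f"{input_volts:.1f} V in -> {ldo_output(input_volts):.2f} V out")
```

In this model the 3V3 rail starts sagging as soon as the input goes below 4.5 V, which is why the LDO based board is the one most sensitive to cable voltage drop.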

    2. I’m glad this is actually being seen to help people – because that is why I made this a separate subject. We are on the verge of having a set of requirements to maximise the life of SDs and minimise power-related issues – all in one long conversation! Very good. Much of this has been discussed but SO many articles out there leave one wondering.. and not a lot further forward. I am as of right now throwing away long USB leads – and testing my new SDs with H2TestW. If we come up with a way to reset those flags which pick up power errors, we’ll have the ability to constantly monitor issues with Node-Red for example – and report any issues by email or whatever. Add to that the log2ram program (recall I read all the way through the ramlog docs only to find out that it (apparently) no longer works for Jessie). Well, log2ram does because I’m now using it – about to test it on other boards.

      1. Of course you don’t use wear-aware filesystems on media which does the housekeeping itself. Those filesystems are designed to deal with mtd devices (those without a controller), and using them here will lead to faster wear-out since two instances trying to optimize at different layers are effectively fighting against each other.

        It’s the same with trying to ‘optimize’ filesystem by running defragmentation software on SSDs: not a single advantage but decreasing lifespan.

  13. Interesting idea on using RAM. I sell an industrial data logger and we used Windows XP Embedded (now using Windows 7 Embedded) and compact flash. We use the EWF option which means that all disk use during operation to the C: drive is in RAM. For data we store this on a D: drive compact flash. We don’t continually write to this.
    After some 10+ years these machines are still in operation with only the occasional corruption to the OS disk (usually due to large power spikes). If we want to make changes to the C: drive, there is an option to commit changes if you install any drivers etc. If you switch it off by doing a shutdown or pull the plug, nothing happens to the C: drive.
    Someone once tried to update their machine by installing standard Windows XP to the compact flash. It lasted about 2 months before the disk was corrupted and read only. 🙂
    Granted our use is down to single application use and this method fits well, but I think this would work fine with the embedded devices we use and reduce the risk of corruption or even just power loss.

    1. A similar approach is also easy with Linux based embedded systems (UnionFS for example, but on the various ARM boards I use running Ubuntu Xenial I prefer overlayfs since it is integrated nicely in Armbian). Add a 2nd disk for real data with controlled behaviour (as you said ‘don’t continually write to this’) and wear-out is close to zero.

      But in my opinion it’s absolutely OK to let Raspbian or another OS run in normal mode off a quality SD card or other flash media as long as those writes that come with insanely high write amplification are handled correctly (mostly the ‘just a few bytes every few seconds’ logging stuff, which is a job for log2ram, folder2ram and friends).

      Since all storage media will eventually die and neither HDDs, thumb drives nor SD cards feature a ‘health forecast’ (on a good SSD you can query the controller’s estimate, eg. SMART attribute 233 ‘Media_Wearout_Indicator’ on Intels or 177 with Samsung, unfortunately vendor specific), we simply try to use btrfs wherever possible: as soon as the monthly ‘scrubs’ comparing checksums with real data indicate that data corruption is starting, it’s time to replace the media. ReFS in Windows scrubs by default too but I doubt it’s available for Win 7 Embedded 🙁
