A Question of Lifespan

Here we started a conversation about SD lifespan. How can we make our SBC projects run for longer? In comments I’ve seen elsewhere, people seem to think it is OK that a Pi may well fail within a year due to SD failure – I don’t think that is even REMOTELY acceptable unless you’re making a novelty games machine. Read on, as we came up with some great solutions…

People make a big deal about the reliability of Linux – not a great deal of use if the entire file system will come to a halt in a year…

I’d never suffered this problem until recently – my first heating system issue has appeared after more than a year’s continuous use (and that includes doing lots of experiments on the same system). You may have seen comments in earlier blog entries about this – for the first time ever I recently suffered a dead SD on one of my Raspberry Pi projects – stone cold dead – read-only, and NOTHING on the Pi or my PC could encourage the SD to write again.

But let’s digress – some of the solutions take time and effort – THIS one is simple.. https://goo.gl/meAAB8  – a DUAL SD adaptor….  if things fail – have someone flick the switch and reboot –  right – back to the plot…. (thanks Antonio)…

It has been said that some cheap SDs are not as large as they seem and as soon as you write beyond their ACTUAL size – the chips become read-only. I’ve yet to test this out but TKaiser has suggested testing all new SDs and in a previous comment has recommended SanDisk Extreme Plus.

The test program H2TESTW is widely available for free. I’m testing my first 16GB disk now – looks like it will take 20 minutes but as no user interaction is needed… time well spent.

In here you will find questions and opinions. In the comments hopefully you will find some resolution – lots of bright people read this blog and I’m hoping they have solutions rather than opinions.

If you read on the web about the subject of eMMC and SD and USB memory – it is hard to tell what is hard science and what is opinion.

For example there are blogs suggesting that instead of relying on SD, use a USB memory stick. I have trouble with this as the technology is similar. Why should a USB stick last any longer than an SD?

You’ll see reference to eMMC – there can be no doubt that eMMC (usually an internal module or chip) is usually faster than SD – but does it LAST any longer – some say yes, some say no. To be sure it is less convenient to back up compared to an SD you can simply pull out and replicate!

Then there is the hard disk. I have a natural tendency to think that a spinning disk has to be less reliable than solid state memory but every experience I have says the opposite. I could not tell you the last time a hard disk went bad on me. Of course – they tend to be more expensive – and they are very much larger than SD.

The general idea is that you can READ SD as often as you want but there is a limit sometimes described as 10,000 write cycles, sometimes described as 10 times that amount. I suspect the latter and that there is just a lot of old information out there.
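To put those cycle counts in perspective, here is a rough back-of-envelope sketch in JavaScript. It assumes perfect wear levelling (every block shares the writes equally) and a guessed write amplification factor. Neither comes from any manufacturer's data, and the function name and figures are purely illustrative.

```javascript
// Rough SD lifespan estimate. ASSUMPTIONS (not from any datasheet):
// ideal wear levelling and a guessed write-amplification factor.
function estimateLifespanYears(cardGB, writeCycles, gbWrittenPerDay, writeAmplification) {
    // Total data the card could absorb if every block wore out evenly
    const totalWritableGB = cardGB * writeCycles;
    // The flash usually writes more than the OS asked for
    const physicalGBPerDay = gbWrittenPerDay * writeAmplification;
    return totalWritableGB / physicalGBPerDay / 365;
}

// Hypothetical 16GB card, 1GB of logging per day, amplification of 10
const lowEstimate = estimateLifespanYears(16, 10000, 1, 10);
const highEstimate = estimateLifespanYears(16, 100000, 1, 10);
console.log(lowEstimate.toFixed(1) + " years at 10,000 cycles");
console.log(highEstimate.toFixed(1) + " years at 100,000 cycles");
```

If a card does little or no wear levelling, the writes land on a few locations instead of being spread out, so the real figure could be far lower than this idealised estimate.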

Then there is WEAR LEVELLING wherein some SDs have a chip inside that helps prevent a single location being written to, too many times – knowledge on this seems to be akin to witchcraft. WHICH manufacturers use this in WHICH SDs and HOW effective is it? I’ve not found a single source of information on the subject that is up to date and verified.

Today I read about putting some directories into RAM.

In the /etc/fstab file you can add for example

tmpfs /var/log tmpfs defaults,noatime,nosuid,mode=0755,size=100m 0 0

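Works a treat but for one tiny item – Apache would not start up! (Most likely because its log directory under /var/log no longer exists once /var/log starts out as an empty RAM disk on every boot.)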

Several people have mentioned RAMLOG – but from what I can see that no longer works with Jessie (the problem of old material hanging around on the web). This looks modern – and is reasonably straightforward to install – takes just a couple of minutes: https://github.com/azlux/log2ram – I installed it – and it works a treat. The default action is to update the disk every hour – but moving the file “log2ram” from /etc/cron.hourly to /etc/cron.daily makes more sense to me.

So many questions – so many potentially wrong answers. See comments about the actual number of writes to SD – would you believe the limit for any given location (not the one you see but the REAL location) could be in the 1000s rather than 10s of thousands? I had no IDEA it was that low.

On the subject of power supplies, in the comments you’ll find code for testing the likes of the Raspberry Pi – as there are registers in the Pi which pick up voltage issues… I was horrified how easily a long USB lead would allow the Pi to work while continually registering issues.

In testing – I found comments from TKaiser useful – then when wondering about the CPU frequency I found THIS article – and the associated script useful..

http://megakemp.com/2013/02/26/adventures-in-overclocking-a-raspberry-pi/

So already we see a need to reduce writes, only use good, tested SDs, and use good supplies with short leads. Not new, not rocket science but I am seeing some good science behind the need for this and look forward to reading more of your educated comments.

Keep the comments coming!

A Little Test

In the process of this discussion, TKaiser supplied us with a little script to return some information about power from the likes of the Pi2 or Pi3. This was intended to be used as a command line tool – repeating until told otherwise. Well, I like REPORTS…

I took out the loop section so as to return a single line of information – and that can conveniently be run in an EXEC node in Node-Red

[Image: Node-Red showing Pi variables]

 

I changed the script to simplify output – if someone can tell me how to produce output without “’C” and “V” so we have just numbers coming out – would be nice… I called this tk2.sh (changing permissions – don’t forget) and ran that inside an EXEC node in Node-Red…

[pcsh lang=”js” tab_size=”4″ message=”” hl_lines=”” provider=”manual”]

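    # Highest permitted CPU frequency in MHz, less a 15 MHz tolerance
    Maxfreq=$(( $(awk '{printf ("%0.0f",$1/1000); }' </sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq) - 15 ))
    # Throttling/under-voltage flags as a binary string (bit 18 on the left, bit 0 on the right)
    Health=$(perl -e "printf \"%19b\n\", $(vcgencmd get_throttled | cut -f2 -d=)")
    # Temperature with the trailing 'C stripped
    Temp=$(vcgencmd measure_temp | cut -f2 -d= | tr -d C | tr -d \')
    # Clock speed in MHz as measured by the firmware
    RealClockspeed=$(vcgencmd measure_clock arm | awk -F"=" '{printf ("%0.0f",$2/1000000); }' )
    # Clock speed in MHz as reported by the kernel
    SysFSClockspeed=$(awk '{printf ("%0.0f",$1/1000); }' </sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq)
    # Core voltage with surplus zeroes and the trailing V stripped
    CoreVoltage=$(vcgencmd measure_volts | cut -f2 -d= | sed 's/000//' | tr -d V)
    # Output: temperature, clock speed, bitfield, core voltage
    if [ "${RealClockspeed}" -ge "${Maxfreq}" ]; then
        echo -e "${Temp}$(printf "%5s" ${SysFSClockspeed}) $(printf "%019d" ${Health}) ${CoreVoltage}"
    else
        echo -e "${Temp}$(printf "%5s" ${RealClockspeed}) $(printf "%019d" ${Health}) ${CoreVoltage}"
    fi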

[/pcsh]

(If you see a question mark above – the word is sampling)

As you can see we have some space-delimited values! If you look at the bitfield, semi-permanent recordings of issues are on the left (most significant bits) while on-going issues are on the right. Extracting from TKaiser’s notes..

The bits on the right are:

0: under-voltage

1: arm frequency capped

2: currently throttled

And corresponding on the left:

16: under-voltage has occurred

17: arm frequency capped has occurred

18: throttling has occurred
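As a minimal sketch of breaking the value down, the function below takes the raw text from `vcgencmd get_throttled` (assumed here to look like `throttled=0x50000`, which is what the `cut -f2 -d=` in the script implies) and lists the bits that are set. The function and table names are my own, not part of TKaiser's script.

```javascript
// Bit positions as listed in TKaiser's notes
const FLAGS = {
    0: "under-voltage",
    1: "arm frequency capped",
    2: "currently throttled",
    16: "under-voltage has occurred",
    17: "arm frequency capped has occurred",
    18: "throttling has occurred"
};

// Accepts "throttled=0x50000" or just "0x50000" and returns the
// names of the bits that are set, lowest bit first.
function decodeThrottled(raw) {
    const hex = String(raw).trim().replace(/^throttled=/, "");
    const value = parseInt(hex, 16);
    if (Number.isNaN(value)) {
        throw new Error("Not a hex value: " + raw);
    }
    return Object.keys(FLAGS)
        .map(Number)
        .filter(bit => (value & (1 << bit)) !== 0)
        .map(bit => FLAGS[bit]);
}

console.log(decodeThrottled("throttled=0x50000"));
```

An empty list means no problems have been recorded since boot.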

It is easy enough to break this down..

Here is another version where I have split up the values

[Image: the input and the split version]

The first is the input – the second is the split version – the same except they are now in 4 different places

var reading = msg.payload.split(" ");
msg.payload = reading[0] + " " + reading[1] + " " + reading[2] + " " + reading[3];
return msg;

And from there you can do what you like with the data of course – one idea might be to read every minute and turn that string bitfield into an integer,  totalling up errors in the lower bits (you could just read bits of the string to achieve the same thing) … after a period send off a report email…
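Here is one way that totalling idea might look, as a sketch only. The helper names and the shape of the totals object are my own invention, and the sample bitfields are made up for illustration.

```javascript
// Convert the 19-character bitfield string from the script into an integer
function bitfieldToInt(bits) {
    return parseInt(bits.trim(), 2);
}

const EMPTY = { readings: 0, underVoltage: 0, frequencyCapped: 0, throttled: 0 };

// Add one reading to running totals of the three "happening now" bits (0, 1, 2)
function tally(totals, bits) {
    const value = bitfieldToInt(bits);
    return {
        readings: totals.readings + 1,
        underVoltage: totals.underVoltage + ((value & 1) ? 1 : 0),
        frequencyCapped: totals.frequencyCapped + ((value & 2) ? 1 : 0),
        throttled: totals.throttled + ((value & 4) ? 1 : 0)
    };
}

// In a Node-Red function node the totals could live in context, e.g.
//   let totals = context.get("totals") || EMPTY;
//   totals = tally(totals, msg.payload.trim().split(/\s+/)[2]);
//   context.set("totals", totals);

// Made-up sample readings
let totals = EMPTY;
for (const bits of ["0000000000000000000", "1010000000000000101", "0010000000000000001"]) {
    totals = tally(totals, bits);
}
console.log(totals);
```

Splitting on `/\s+/` rather than a single space avoids empty fields when the script pads a three-digit clock speed with extra spaces.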

No need to report over-heating as the governor should take care of that – however – min-max summary in the email might be nice while testing.


142 thoughts on “A Question of Lifespan”

  1. Hi all,

    I love home automation (as do we all I’m sure) and really like Peter’s contributions to the node-red community. This article has been an interesting read but I didn’t find an answer to my question so I’ll put my quandary to the good folks here.

    I started using node-red to control MQTT based devices around my home (mostly lights) but also to control a hydroponics system I have set up. It’s based on a RasPi 2 and has grown to include dashboard buttons controlling 8 gpio’s, several slider controls, other buttons controlling overall system functions, 6 graphs and 6 dial gauges. The gauges and dials are driven by monitoring files on the SD card. There are several python scripts that run in the background gathering data every few minutes and one runs every few seconds.

    The problem is that lately the editor and the dashboard have become VERY slow, almost unusable, and I was wondering if I have hit an upper limit on node-red’s capability, h/w limitations or the following…

    As this article points out, SD writes will eventually cause a system failure so in order to reduce the number of writes to the SD card (10,000 writes goes by fast when you are data-logging every few seconds), I changed my python data gathering scripts to write to a network mounted SAN drive instead of the SD card.

    Does anyone have experience with a similar issue? Is the SAN mounted drive likely the bottleneck or should I upgrade my h/w or try a ram disk or even a USB drive solution? I also thought about offloading the gpio control to an ESP8266 and just have the RasPi do the MQTT and node-red parts but this seems to me of minimal value at best.

    Any suggestions would be appreciated.

    Bruster

    1. Ok I have written to the supplier. If you recall in another blog entry I wrote about similar units. A MAJOR stumbling block being that when the battery was completely flat, as would happen if the power failed for any reason, such units would NOT start up until the load (Pi) had been disconnected. Also – some of the battery units these are used in need a button pressing to start them up.

      So before using these we need to confirm that firstly they WILL deliver up to 2.5 amps while simultaneously being charged, that they will do this without button pressing and that they will recover from a flat battery situation.

      If they do this – you’d better get in quick as I’ll be buying out their stock 🙂

      The only issue here may be language – it is China and so my email may or may not be understood. Time will tell.

  2. Ref SD corruption, my vote goes with looking at what OpenWRT and LEDE do with regards to filesystems; they have long experience in this area.
    Having the base FS readonly, and an overlay over it for modifications, with an easy mechanism to restore to ‘factory’ if the overlay corrupts.
    But on something like an SD or USB, some knowledge of where the erase boundaries are would be important – no point in the read-only part sharing an erase sector with some dynamic content. Great thing about this is that it will almost always boot even with a corrupt overlay, as most of the meaningful stuff is in the read-only part.
    I prefer even better to have the overlay on a different media – e.g. boot from read-only SD, overlay on disposable (backup-able) USB stick.
    My personal experience is zero corruption on OpenWRT (on maybe 10 devices) (ok, corruption on one device – because the nand interrupt was shared with the wifi interrupt! – a very bad move by BT); but I’ve had to recover my OrangePi three times from filesystem corruption (not SD card failure.. just corruption) in 4 months.
    Oh, and having worked with SSDs and analysed the controllers therein, I know I’ve never owned an *SD card* or a *USB stick* with any form of wear leveling :(. so a journaling FS which walks the complete card before re-using parts is good (like NTFS). An FS which repeatedly writes the same location when overwriting the same file (like FAT) is bad… – pity my direct experience of our linux filesystems is lacking on this point :).

    1. and it is such a good link I would have incorporated it into the blog entry – but I’m not entirely sure it has much to do with our little SSDs however. We now know from discussions what “wear levelling” is but it still seems to be less than obvious getting information as to how this is implemented in the various SSDs we use.

  3. Yes Antonio – very bad board. Andreas is thinking of using a little micro to control one of these supplies… and thinking about it – not that big a deal.. What is needed as the battery voltage goes down, say, to 3v or so is to activate an output – which the Pi would pick up – and turn itself off. Now at this point you can’t rely on battery voltage as it would immediately rise due to the load disappearing. So I’m thinking the next step would be to wait, say 3 minutes, beeping while you wait and then turn off the power to the Pi. Given a suitable relay and bearing in mind you only have 3v, this might take some thought. At this point the voltage would shoot up so you could not do a simple voltage comparison – and you’d have to make sure your A/D convertor was running on the internal lower voltage reference… I’m thinking the thing to do next (after turning the alarm off) would be to wait and see what the voltage does – do nothing for 10 seconds after the power to the Pi is switched off.

    At this point I’d be taking 10 readings 1 minute apart – if all 10 show reading on reading increase – and the battery is above say 3.5 volts??? Turn the power back on to the Pi.

    Of course on power up of the little controller – the battery could be flat or full – if full you would not SEE any increase in voltage over time so perhaps in that case – on power up if the battery voltage is above 3.8v – just turn the Pi on.

    These are just thoughts and it would not take much in terms of time or money to make such a little adaptor – but that relay might take some thought!
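    To make those thoughts concrete, here is a sketch of just the decision logic, written in JavaScript for readability (a real controller would be a small micro, probably programmed in C). The thresholds are the rough figures from the paragraphs above, and the function names are made up for illustration.

```javascript
// Thresholds from the comment above; all are rough guesses to be tuned
const SHUTDOWN_VOLTS = 3.0;   // ask the Pi to shut down at or below this
const RESTART_VOLTS = 3.5;    // battery must be above this before restarting
const FULL_VOLTS = 3.8;       // on controller power-up, treat as charged

// True when the Pi should be told to shut itself down
function shouldShutDown(batteryVolts) {
    return batteryVolts <= SHUTDOWN_VOLTS;
}

// True when the controller has just started and the battery is already healthy
function canStartImmediately(batteryVolts) {
    return batteryVolts > FULL_VOLTS;
}

// readings: the 10 samples taken one minute apart after the Pi was switched off.
// Restart only if every sample is higher than the one before (the battery is
// charging) and the latest one is above the restart threshold.
function shouldRestart(readings) {
    if (readings.length < 10) {
        return false;
    }
    for (let i = 1; i < readings.length; i++) {
        if (readings[i] <= readings[i - 1]) {
            return false;
        }
    }
    return readings[readings.length - 1] > RESTART_VOLTS;
}

// Made-up sample data: one battery on charge, one just sitting there
const charging = [3.30, 3.34, 3.38, 3.42, 3.45, 3.48, 3.51, 3.54, 3.57, 3.60];
const resting = [3.30, 3.31, 3.31, 3.31, 3.31, 3.31, 3.31, 3.31, 3.31, 3.31];
console.log(shouldRestart(charging), shouldRestart(resting));
```

    The 3 minute beeping delay and the 10 second settling time would sit around these checks as simple timers.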

    1. BTW: That’s why playing with tablet SoCs that come with PMIC (support) is always a nice approach when dealing with batteries. Since if set up correctly everything is already in place to deal with such situations (since SoC/PMIC combo made for exactly that).

      I’m still hoping that those Chinese board makers throwing cheap H3 boards at us (H3 is a TV box SoC lacking direct PMIC support) will pick up similar cheap R SoCs (which are just renamed Allwinner A series SoCs, R8 used on CHIP is an A13 in reality, R16 is an A33 and R18 an A64, R58 an A83T and now it seems R40 has been renamed finally to A40). Those SoCs contain an own OpenRISC ultra low power core to deal with deep sleep states and together with PMIC + battery all that’s needed is just software + external power source.

    1. Kinda off-topic in this thread but anyway: such solar powered setups suffer from two potential problems: you need power banks that can be charged and provide power to connected devices at the same time (not all do, and especially not the cheap ones!) and then I doubt you can see anything on such a display in bright light (I even bought a kinda oversized Kindle to be able to work in sunny environments 😉 )

  4. Many thanks to everyone for the really useful comments here, especially tkaiser for the detective work. I’ve had time to run tests on the RasPis around here, and as expected they all have power problems. Interestingly the two with best results have the current all-white “official” RasPi PSU. I’m going to experiment on one, with a shorter thicker cable to the multiple +5 and ground pins on the GPIO header. I wonder if there could be a market for a “high performance Pi” PSU with GPIO connector?

    1. And now imagine how many RPi 2/3 users are in the same boat but just don’t know 😉

      BTW: I would not power through GPIO pins on Raspberries without taking a lot of care since you’re bypassing some protection (this is different on a lot of other SBC where there’s no difference).

      Better buy short 20AWG rated USB cables (if you buy 3 or 5 they’re really cheap on eg. Amazon) or a dedicated good PSU (5.15-5.2V with a short cable) or those guys combined with a good normal 5V PSU: https://aliexpress.com/item/DC-5-5×2-1-mm-Jack-To-Micro-USB-B-5P-Right-Angle-Male-Power-Adapter/32304571289.html (please be careful, 5.5/2.1mm is also used by a lot of 12V PSUs!)

      1. I believe I am with you on this TKaiser – stick with the micro-USB connector – but short, thick leads and decent supply. That 90 degree adaptor is useful as it is easy to make your own leads that way – incidentally the postage is significant on that one so if anyone sees this item at a better price feel free to put a link in here – along with any links to PROVEN decent leads.

          1. Right – good stuff Antonio – I looked at the English-speaking AliExpress and some offers looked great until you took postage into account – Spain being worse than the UK. Then I came across this link – which I’ve shortened for brevity..

            http://bit.ly/2oIguAy

            That works out at under £1 each – the short length of cable will have negligible effect and it means you can use a decent supply with thick 2-core cable – well spotted.

            This thread is turning out to be one of my better ideas – lots of good stuff in here.

            I’ve ordered 5 to get me started. Must make some simple Node-Red to email me if there are any power supply issues.

          2. I ordered some cheap ‘USB type A to 5.5/2.1 mm cables’ recently on Aliexpress just to throw them in the bin in the meantime.

            Idea behind: I do a lot of precise consumption monitoring stuff (mis)using a Banana Pro as ‘measuring PSU’ powering other boards through one of its USB ports. Banana Pro like most other Allwinner A20 boards features a power management IC that is able to measure current/voltage and with some patience (30 min average values) you get pretty precise consumption numbers of connected boards.

            Only cables missing were those for ‘5.5/2.1mm barrel jack’ only boards. But those cables from Aliexpress were a joke. 50cm long, pretty thin and obviously resistance (impedance?) way too high. Only boards that boot with these cables are OPi Zero and NanoPi NEO/Air with Armbian’s IoT settings (limiting clockspeeds to the min to lower consumption). With normal settings even those boards freeze/reboot at startup.

            TL;DR: Don’t trust cheap cables even if they’re short 😉

            1. the ones we bought are about 10cm, and cable seems thick… sure, cables inside could be thin… let’s try, but i already ordered microusb and others connectors to do my own cables… btw, the Aukey short ones i’m using seem good…

            2. While I understand the sentiment, given the lack of precise information at point of sale it isn’t easy to tell if a cable is “expensive” or someone is just making more profit. Hence general request for “proven” links.

              1. i know, but to be proven, we need:
                1) someone to actually BUY those items
                2) same one to wait, even 2 months…
                3) always the same one to test it…

                by myself, i’m at point 2… tic… tac… tic… tac… tic… tac… tic… tac…

                1. Hah, well I have ordered the little 90degree adaptors – couple of weeks.

                  Also on that Pi Power supply I reviewed a while ago – they gave me a refund as it was rubbish, resetting when you disconnected power – well, Andreas may be right – I put a 330u cap on it and it is ALMOST ok (doesn’t crash but the red LED on the Pi flicks) , I’m going to copy him and put a 3,300u cap on the output. Only issue right now is they are cheap – but the thieving suppliers in the UK want far more in postage than the cost of the caps – so – Hong Kong it is.

                  1. Ok, scratch that – Andreas has confirmed that the unit if allowed to get quite flat does not start up properly before being “primed” by an external supply. I hope the Chinese developers are reading this… WHAT WERE YOU SMOKING WHEN YOU DESIGNED THIS!!!???

  5. side note: in the search of the perfect server to host my setup, recently my cousin dropped an Asus T100, one of those hybrid mini notebook with detachable keyboard and 10″ lcd touchscreen… she was going to throw it in the trash bin, while i asked to give it to me… well, as you can see, shattered touchscreen, but screen is fine… keyboard not working, but as all the electronics is in the lcd half, i removed it…

    so now i’ve a baytrail atom 4 core 1.46ghz, with 2gb ram, a decent 32gb emmc, an hdready lcd, and most important of all, an 8000mah battery, which with windows last about 10 hours, so i think with screen off and only using it remotely via ssh and all THE SCRIPT goodies, it should last i think a full day…

    using a microusb hub i connected both a keyboard/mouse combo and an usb dongle with which i installed ubuntu xenial (little problems to deal with EFI Bios, nothing unresolvable with a little of time), script completed in 25 minutes, included tons of updates as the iso was an year old (now we are at 16.04.2, mine was .0, now is .2 after online updates)… it’s fast, it lasts long, it has x64 ubuntu on intel version, i can manage everything, have to try the embedded camera, too…

    well, for something that was going to be discarded… i plan to attach its PSU to a sonoff so my setup can turn it on when battery under 10%, without having to leave it always connected… i’m very happy with it! 🙂

  6. Tested my new SanDisk 16GB Ultra Micro SD Card (SDHC) with h2testw_1.4. Passed 🙂 Bought this from MyMemory in UK but sent from Switzerland. (I don’t like that the new Pi 3 doesn’t have a spring loaded SD slot – makes it more difficult to remove when the Pi is in a case).

    Yet to test the PSU (thanks tkaiser for the process) as literally spent all day trying to get a bluetooth speaker working from the CLI. Just sussed it 🙂 Jessie has BT built-in but obviously not complete as it only worked via the GUI.

  7. Yes, this thread does get very technical but if at the end of the day it can come up with some simple changes newbies can implement to minimise risk, then it will be very worthwhile 🙂

    I’ve just received my 2nd Pi (a model 3 this time) and another SanDisk 16GB Ultra Micro SD Card (SDHC). What I also ordered was the Official Raspberry Pi 3 Universal Power Supply. At £7 why buy anything cheaper.

    In this thread there are many comments about using quality PSU. Any opinion on these RPi labelled PSUs?

    PS. my 2 year old official PSU was by Stontronics (model DSA-13PFC-05 FCA). The latest is by RS Components (model DSA-13PFC-08 FCA). Both cables marked 18AWG. One was bought from Pi Hut and the other Pimoroni , both seem to be leading Pi retailers, so hopefully no fakes.

    1. The official RPi 3 PSU is a good choice, but why trust it blindly? Simply connect as many USB peripherals as possible, run ‘stress -c 4 -t 600’ and then check the output from

      perl -e "printf \"%19b\n\", $(vcgencmd get_throttled | cut -f2 -d=)"

      If there’s a ‘1’ on the left you suffer from powering problems and when the next digit is also ‘1’ then overheating also occurs. It would be so easy for the Raspbian guys to call the above on shutdown and redirect the output to e.g. /var/log/raspberry-health.log, check this as part of a boot script and warn users if either happens so they can act accordingly (improve power supply, add a heatsink).

  8. Thanks to all for feedback on this subject – this is great and it is really much in line with what I’d hoped for this blog when I started it as a simple reminder to myself as to how I’d made things work (I have a memory like a sieve).

    Here we have some great collected information about using SDs from which I’m sure others will benefit. I have already implemented the RAM disk for logs – working a treat though I think I need to check it saves them on power-down as well as on a timed basis – and I now know of some dead routes NOT to go down. I’m sure the discussion will continue – and once I’m settled down in Spain next week hopefully will start to make use of this. I’ll be taking a little time out to install and test my new 250w solar panel and the electronics that go with it – I’ll cover that on the blog.

    For now however – thanks to everyone for their continued input of knowledge and enthusiasm.

    4 day trip coming up much of which is broadband-free.

  9. The script on megakemp.com is unfortunately neither suited for RPi 2 nor 3 since it does NOT report the real clockspeed of the CPU cores.

    For whatever reasons the RPi kernel is lying wrt /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq in throttling situations. Your RPi 3 without a heatsink in a tiny enclosure might throttle down to 601 MHz and this script (or almost EVERY OTHER script available on the net) will still report 1200MHz. Only if throttling reaches 600 MHz then the real clockspeed will be reported (same with RPi 2 but there’s 600 vs. 900 MHz)

    I sent them a comment on github and hope they update the script and use ‘vcgencmd measure_clock arm’ instead to report the real clockspeed and not random numbers.

  10. Notes as I go along… with proper supply and decent leads I see continuous zeroes on TKaiser’s test… it may or may not be relevant but the Pi which had an issue with SD – has power supply under-voltage issues. Meanwhile H2testw just passed my new Amazon Sandisk ultra 16GB with flying colours.

    1. Good to hear! I always run another test after getting new SD cards, and that’s the iozone call from here (the ‘standard performance test’ as I would call it): https://forum.armbian.com/index.php?/topic/954-sd-card-performance/

      Sadly Raspbian doesn’t ship with iozone installed and there’s also no iozone3 package available (this is really sad since people then start to use hdparm/dd which produce just numbers without meaning and do not even test the most important numbers: random read and especially write performance).

      But there’s always a simple script around the corner, just use Jeff’s curl call from here: http://www.pidramble.com/wiki/benchmarks/microsd-cards

  11. Love this whole topic. I also believe that a system should stay up and running for years (decades) if it doesn’t need to be rebooted (firmware upgrades/patching or power shutdown).

    I ran into the USB cable issue when I found I could charge my phone or tablet in about 45 minutes (from dead to full) with the cable provided with the 10AHr battery. I’ve had original Pi 2s that blew up the SD (didn’t know about the firmware bug, good to know) and want more than anything to limit what is written to the SD (rsyslog *.* @mozart.uucp:514). I just read the above and want to thank everyone for sharing. I now have to reread and take notes. 🙂

    Pete, let me see if I can figure out the NFS root for the Pi 2. I’ve got a spare with a broken SD card holder. I can use that one and tape the card in place. I won’t be able to test that right away though.

  12. GUYS – thank you all for your contributions here but can I ask a favour. The reason I put this blog entry up was to make accessible in a single place a discussion that helped us move forward – and especially those who don’t understand the issues – sadly some of these responses are very “Geek Only” making large assumptions about knowledge and using many abbreviations.

    Can we keep it simple where possible ( I know that some of the problems are NOT simple) – there are a couple of paragraphs in the above responses which may as well be in Chinese for a lot of readers.

    Thanks again for all the feedback.

    1. Pete, (with respect as it is your site), why not let the “Chinese” flow?
      I for one, always feel privileged to read “Deep Geek sharing” on your blog and appreciate all who take the time to post.
      If it’s something I haven’t seen before (Great!), I really like the links so I can begin to translate the “Chinese” for myself 🙂

      How about when things settle down, just summarise the main findings e.g.
      – Use a better power supply (e.g. link)
      – Use larger SD cards
      – Check for bit settings (which record power failures)
      – Consider use of external storage with Storage (EXT?) setup on this device (USB memory?)
      – Change the script to output logs etc to external memory
      – Other…

      In the end, the objective is the same thing really – to stop a Pi crashing with memory problems?

      1. Ok you’ve talked me into it – don’t anyone complain I didn’t try to reduce the jargon – as for that BIT – I’m not convinced, awaiting a response from TKAISER. With the mouse left connected I get that “1” despite VERY little difference in supply voltage – I’m wiring up my 3 amp 5v linear supply right now to try bypassing the USB connector to test that further. More here as I go on.

        1. Where did you get this power meter with Micro USB in and Micro USB out? If you use a typical ‘USB Charger Doctor’ or stuff like that you’re measuring at the wrong end of the cable. The voltage drop is the result of cable and contact resistance so you would have to measure on testpoints TP1/TP2 on the RPi itself (most likely GPIO pins 2 and 6 will also do while you generate some load with ‘stress -c 4’ or something like that)

          1. Hi Tkaiser….. I’m measuring the voltage on connectors 6 (ground) and 4 (5v) on the Pi itself, I’m using a thick cable from a professional linear PSU. The power is also going to pins 6 and 4 for the purposes of this test.

            1. Sorry – to complete – and using a decent digital multimeter – not one of those cheap and nasty “USB Charger Doctors” – which I must say ARE good for weeding out really rubbish cable.

              For the purpose however of validating this continuous readout of frequency and temperature of yours – which I have running in a separate window – I am not seeing an indisputable tie up between frequency and voltage OR frequency and temperature, despite testing voltages AT THE PI2 of 4.9 at one extreme and 5.3 at the other.

              So help me out here – what can be accounting for the fact that there seems to be no clear correlation?

              1. Now tested on and adapted for RPi 2 too: https://pastebin.com/t5vTfSPa

                To get results that seem less weird (the weirdness is caused by the so-called ‘cpufreq governor’), switching to ‘performance’ might make some sense: echo performance >/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

                And then it only gets interesting comparing the way you power the Pi now and back through Micro USB.

      2. It’s not ‘better power supply’ since the problem is the cable between power supply and the board. You can use a 3A rated PSU and will get in trouble for sure if the USB cable used has too high a resistance (and that applies to 99.99% of these cables since they were designed for 500mA max!). But you can use a 1A rated PSU with a good cable and be fine.

        And that’s the only reason those ‘good’ power supplies all provide not a type A receptacle but a fixed cable, and provide at least 5.1V: the real problem, which 99.99% of Pi users are also unaware of, is called voltage drop (while everyone talks about the power brick and amperage instead). Once you’ve fixed undervolting and learned how to avoid crappy SD cards almost everything is already fine 🙂
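        The voltage drop argument is just Ohm's law, and a small sketch shows the scale of it. The resistance figures are nominal values for copper wire from standard AWG tables. The cable lengths, the gauges and the 2A load are hypothetical examples, not measurements from this thread.

```javascript
// Nominal resistance of copper wire in ohms per metre (standard AWG tables)
const OHMS_PER_METRE = { 20: 0.0333, 24: 0.0842, 28: 0.213 };

// Voltage lost in the cable: current flows out on +5V and back on ground,
// so the conductor length is twice the cable length. Contact resistance of
// the connectors is ignored, which makes the result optimistic.
function cableVoltageDrop(awg, lengthMetres, amps) {
    const resistance = OHMS_PER_METRE[awg] * lengthMetres * 2;
    return amps * resistance;
}

// Hypothetical comparison at a 2A load from a 5.1V supply
for (const [awg, length] of [[28, 1.5], [24, 1.0], [20, 0.3]]) {
    const drop = cableVoltageDrop(awg, length, 2);
    console.log(`${awg}AWG, ${length}m: drop ${drop.toFixed(2)}V, ` +
        `${(5.1 - drop).toFixed(2)}V at the board`);
}
```

        Because the drop scales with both current and length, a short thick lead helps far more than a bigger amp rating on the power brick.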

        1. Yes appreciate comment about power supply – what I’m using is a lab supply with thick speaker cable and measuring the voltage at the Pi2.

          1. Nice to have a test script. I tried this on a Pi2

            The first half were predominantly taken with the voltage at 4.95v – the second half at 5.2v – as you can see, nowhere near conclusive – so to recap this is a power supply feeding pins 4 and 6 on a Pi2 (with nothing else connected other than hardwired ethernet).

            The voltage generally goes up but the frequency.. well, see for yourself.

            So to see if there was noise involved I tried putting an electrolytic across the pins and even a supercap… no significant difference.

            39.5’C 600 MHz 1010000000000000000 1.2V
            40.1’C 600 MHz 1010000000000000000 1.2V
            39.5’C 600 MHz 1010000000000000000 1.2V
            40.1’C 900 MHz 1010000000000000000 1.3125V
            40.1’C 600 MHz 1010000000000000000 1.2V
            39.5’C 600 MHz 1010000000000000000 1.2V
            39.5’C 900 MHz 1010000000000000000 1.3125V
            40.1’C900/600 MHz 1010000000000000000 1.3125V
            39.5’C 600 MHz 1010000000000000000 1.2V
            39.5’C900/600 MHz 1010000000000000000 1.3125V
            39.5’C 600 MHz 1010000000000000000 1.2V
            40.6’C 900 MHz 1010000000000000000 1.3125V
            40.1’C 900 MHz 1010000000000000000 1.3125V
            40.1’C 600 MHz 1010000000000000000 1.2V
            40.1’C 600 MHz 1010000000000000000 1.2V
            40.1’C 600 MHz 1010000000000000000 1.2V
            40.1’C 900 MHz 1010000000000000000 1.3125V
            39.5’C 600 MHz 1010000000000000000 1.2V
            40.1’C900/600 MHz 1010000000000000000 1.3125V
            40.1’C 600 MHz 1010000000000000000 1.2V
            40.1’C900/600 MHz 1010000000000000000 1.3125V
            39.5’C 600 MHz 1010000000000000000 1.2V
            40.1’C900/600 MHz 1010000000000000000 1.3125V
            40.1’C 900 MHz 1010000000000000000 1.3125V
            40.1’C 600 MHz 1010000000000000000 1.3125V
            40.1’C900/600 MHz 1010000000000000000 1.3125V
            40.1’C 600 MHz 1010000000000000000 1.2V
            40.1’C900/600 MHz 1010000000000000000 1.3125V

            So to iron out power supply issues I scrapped all that and tried a RAVPOWER battery – this is a (genuine) 26AH battery pack able to put out 2.1 amp usb – into a lead I bought for charging my ridiculously high power phone – and it does a better job than other methods.

            Here are the readings…. RPI2 using new Raspbian – nothing plugged in except Ethernet. Voltage read at pins 4 and 6 – 5.03v

            To stop simply press [ctrl]-[c]

            41.2’C 900 MHz 0000000000000000000 1.3125V
            40.6’C 600 MHz 0000000000000000000 1.2V
            40.6’C 600 MHz 0000000000000000000 1.3125V
            40.6’C 600 MHz 0000000000000000000 1.2V
            41.2’C 600 MHz 0000000000000000000 1.3125V
            40.6’C 600 MHz 0000000000000000000 1.2V
            40.6’C 600 MHz 0000000000000000000 1.2V
            41.2’C 900 MHz 0000000000000000000 1.3125V
            40.1’C 600 MHz 0000000000000000000 1.2V
            40.1’C900/600 MHz 0000000000000000000 1.3125V
            40.6’C 600 MHz 0000000000000000000 1.2V
            40.1’C 600 MHz 0000000000000000000 1.3125V
            40.6’C 600 MHz 0000000000000000000 1.2V
            40.6’C 600 MHz 0000000000000000000 1.2V
            42.2’C 900 MHz 0000000000000000000 1.3125V
            40.6’C 600 MHz 0000000000000000000 1.2V
            41.2’C 600 MHz 0000000000000000000 1.2V
            41.2’C 900 MHz 0000000000000000000 1.3125V

            So – I know you tried this on a Pi3 (I don’t have one handy – it is running the house) but on a Pi2 I am not seeing any great correlation here – the only difference being that on the battery you’ll notice all zeros….

            Those zeros were consistent over several power cycles..

            I then took the SAME short lead to the Original power supply

            44.4’C 900 MHz 0000000000000000000 1.3125V
            44.4’C 900 MHz 0000000000000000000 1.3125V
            44.4’C 900 MHz 0000000000000000000 1.3125V
            43.3’C 900 MHz 0000000000000000000 1.3125V
            42.8’C 600 MHz 0000000000000000000 1.2V
            42.8’C 600 MHz 0000000000000000000 1.2V
            42.8’C 900 MHz 0000000000000000000 1.3125V
            42.2’C 600 MHz 0000000000000000000 1.2V
            42.2’C900/600 MHz 0000000000000000000 1.3125V

            back to the ORIGINAL supply with the longer (2ft) lead

            44.4’C 900 MHz 1010000000000000000 1.3125V
            44.4’C 900 MHz 1010000000000000000 1.3125V
            44.4’C 900 MHz 1010000000000000000 1.3125V
            44.4’C 900 MHz 1010000000000000000 1.3125V
            44.9’C 900 MHz 1010000000000000000 1.3125V
            43.3’C 600 MHz 1010000000000000000 1.2V
            42.8’C900/600 MHz 1010000000000000000 1.3125V
            42.2’C 600 MHz 1010000000000000000 1.2V
            42.8’C900/600 MHz 1010000000000000000 1.3125V
            42.8’C 600 MHz 1010000000000000000 1.2V
            42.2’C900/600 MHz 1010000000000000000 1.3125V

            So – conclusions so far…

            Short lead important to the binary errors shown but makes NO apparent difference to frequency or internal voltage.

            Smoothing cap at Pi end makes no apparent difference.

            Actual voltage going in makes little difference from 4.9 to 5.2v

            1. The problem is that your test setup also has both cpufreq scaling and ‘dynamic voltage frequency scaling’ active. Which can be confusing to say the least. If you do a ‘cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor’ it will most probably be ‘ondemand’. And this leads to a kernel driver switching between 600MHz/1.2V (idle) and 900MHz/1.3125V (some or heavy load).

              As soon as you switch to performance it starts to get interesting since then RPi should remain at 900MHz/1.3125V all the time. With the crappy cable it will now start to drop down to 600MHz/1.2V even under full load while with your good bench PSU it will stay at 900MHz/1.3125V. The monitoring script’s output is only interesting when there’s some load on the RPi (eg. ‘stress -c 4’) or some power hungry USB peripherals.

              And when powered through USB you’ll see ‘1010000000000000000’ switching to ‘1010000000000000101’ and should also be able to measure the voltage drop on the GPIO pins while you’ll also see that max internal voltage will be limited to 1.2V and cpufreq to 600MHz even if there’s a task like stress hammering the CPU cores.

              Damn: I’ve no idea how to conveniently switch cpufreq governors with latest Raspbian (am using an OpenMediaVault build right now where it’s ‘cpufreq-set -g performance’)
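One way to check and switch the governor without extra tools is to read and write the same sysfs file used in the `echo performance` command earlier in this thread. The following is only a sketch: it assumes the standard `/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor` layout, the function names are made up for this example, and writing needs root.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")


def governor_files(cpu_root=CPU_ROOT):
    """Find the scaling_governor file of every CPU core (cpu0, cpu1, ...)."""
    return sorted(Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_governor"))


def get_governors(cpu_root=CPU_ROOT):
    """Return e.g. {'cpu0': 'ondemand', ...} for all cores found."""
    return {f.parts[-3]: f.read_text().strip() for f in governor_files(cpu_root)}


def set_governor(name, cpu_root=CPU_ROOT):
    """Write the governor name to every core; requires root on a real system."""
    for f in governor_files(cpu_root):
        f.write_text(name + "\n")
```

Run as root, `set_governor("performance")` followed by `print(get_governors())` should show the change. The setting does not survive a reboot, so it would have to be repeated at startup.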

              1. Thanks for this TKaiser – it is all slowly starting to dawn on me – and hopefully through this dialog on others.

                So I was on the wrong track looking for speed changes related to power. That is set by load – I think the next job is to find the easiest way to set the pi to run at full power all the time unless it is overheating – which is what I THOUGHT it was – but it isn’t.

                The relationship between average voltage and errors is not at all clear but the relationship between lead IMPEDANCE (resistance is good enough) and errors is very clear.

                My layman’s version of this suggests that while all my solutions generate enough voltage, the nature of Pi requirements appears to be that it has high instantaneous peak demands that are not met with thinner leads and/or inadequate power supplies…. and now we have a simple test.

                What we still don’t seem to have is the ability to RESET those error test bits – right now one glitch would trigger them – and so we’ve no idea HOW bad things are. Fixing that so that, say, Node-Red could be continually testing power quality – would indeed be a nice step forward.

                1. It’s all there: we have an obscure reading of 19 bits. The 3 on the right side describe actual status, while the 3 on the left are set to 1 once the individual trigger has been fired:

                  The bits on the right are:
                  0: under-voltage
                  1: arm frequency capped
                  2: currently throttled

                  And corresponding on the left:
                  16: under-voltage has occurred
                  17: arm frequency capped has occurred
                  18: throttling has occurred

                  So if you see ‘1010000000000000000’ then 0 and 2 were set to 1 in the past (but are fine right now) so 16 and 18 remain at 1 until next reboot. After my test an hour ago though I would say the explanation for 1/2 and 17/18 should be exchanged since bits 1 and 17 were only set to 1 after I put a pillow on poor RPi 3 to heat it up above 80°C.
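The bit list above can be turned into a small decoder, which may be a starting point for the Node-Red style monitoring mentioned earlier. This is a sketch in Python: the table follows the list exactly as written, so if bits 1/2 and 17/18 really are exchanged as suspected, the labels would need swapping. The function name is invented for this example.

```python
# Bit meanings exactly as listed in the comment above (possibly with 1/2 and
# 17/18 swapped in reality, as noted there).
FLAGS = {
    0: "under-voltage",
    1: "arm frequency capped",
    2: "currently throttled",
    16: "under-voltage has occurred",
    17: "arm frequency capped has occurred",
    18: "throttling has occurred",
}


def decode_status(bits):
    """Decode a 19-character binary string; the leftmost character is bit 18."""
    value = int(bits, 2)
    return [name for bit, name in sorted(FLAGS.items()) if value & (1 << bit)]


for reading in ("0000000000000000000", "1010000000000000000", "1010000000000000101"):
    print(reading, decode_status(reading) or "all clear")
```

Comparing only bits 0 to 2 between two readings shows what is wrong right now, regardless of what has been latched in bits 16 to 18 since boot.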

                  And if thermal throttling occurs the numbers get really weird since then maximum cpufreq will still be reported (900 MHz on RPi 2 and 1200 MHz on RPi 3) while in reality the clockspeed is already much lower. That’s close to fooling users 🙁 But fortunately that’s an RPi 3 only problem since it’s really hard to get the other Raspberries to overheat.

                2. Impedance as an electrical engineering term includes inductance/resistance/capacitance.

                  Power leads of USB cables have an inductance of about 0.5 … 1 uH (microhenry) per meter. Thick or thin cables doesn’t matter, as long as the ratio of copper to insulation diameter is constant (thicker wires tend to have thicker insulation). Short cables have a low inductance, thick cables are about the same as thin ones.

                  High inductance hinders the power supply from reacting quickly to changes in current draw (well, the power supply does react, but the increased current reaches the load too late).

                  If there are long wires, bypass capacitors are important. The increased current (and *only* the difference to the former current) has to come from the capacitors, which will slowly discharge, and recharge again when the increased current reaches the load. If the capacitors are too small, their voltage will have dropped too much.
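The two effects described above follow from V = L · dI/dt for the cable and ΔV = ΔI · t / C for the bypass capacitor. A small Python sketch makes the orders of magnitude visible. Only the 1 uH per meter figure comes from the comment; the current step, timings and capacitor size are hypothetical.

```python
def inductive_drop(inductance_h, current_step_a, rise_time_s):
    """V = L * dI/dt: voltage lost across the cable while the current ramps up."""
    return inductance_h * current_step_a / rise_time_s


def capacitor_droop(current_step_a, duration_s, capacitance_f):
    """dV = dI * t / C: sag of a bypass capacitor supplying the extra current."""
    return current_step_a * duration_s / capacitance_f


# 1 m of cable at 1 uH (upper end of the range quoted above).
# Hypothetical load: a 0.5 A step within 1 us, bridged for 10 us by 100 uF.
print(f"cable:     {inductive_drop(1e-6, 0.5, 1e-6):.3f} V")
print(f"capacitor: {capacitor_droop(0.5, 10e-6, 100e-6):.3f} V")
```

With these made-up numbers the cable alone would lose a large share of the headroom during the step, while a reasonably sized capacitor next to the load sags far less.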

                  There is also a difference between the first RPi1 and almost every other board – the RPi1 used an LDO, while modern boards use synchronous buck converters. The RPi1 LDO has a dropout of about 1.0 … 1.2 V at 500 … 800 mA. As soon as the input voltage drops below 4.5 V, the output voltage drops as well. This is especially important for the SD, which typically needs 3V3.

                  The RPi1 A+/B+ and all later boards use a synchronous buck converter, which can deliver 3V3 even at an input voltage of 3.8V (although the USB ports will be out of spec). The Pine64 is able to run at 3.7V and less (it is designed to be powered from a single LiPo/LiIon cell), and when you select the “BAT”tery mode, even the 5V rails are powered correctly from the step up converter on the board.

    2. I’m glad this is actually being seen to help people – because that is why I made this a separate subject. We are on the verge of having a set of requirements to maximise the life of SDs and minimise power-related issues – all in one long conversation! Very good. Much of this has been discussed but SO many articles out there leave one wondering.. and not a lot further forward. I am as of right now throwing away long USB leads – and testing my new SDs with H2TestW. If we come up with a way to reset those flags which pick up power errors, we’ll have the ability to constantly monitor issues with Node-Red for example – and report by email or whatever any issues – add to that the log2ram program (recall I read all the way through ramlog docs only to find out it (apparently) no longer works for Jessie – well log2ram does because I’m now using it – about to test it on other boards).

      1. Of course you don’t use wear-aware fs on media which does the housekeeping itself, since those filesystems designed to deal with mtd devices (those without a controller) will lead to faster wear-out: two instances trying to optimize at different layers are effectively fighting each other.

        It’s the same with trying to ‘optimize’ filesystem by running defragmentation software on SSDs: not a single advantage but decreasing lifespan.

  13. Interesting idea on using RAM. I sell an industrial data logger and we used Windows XP Embedded (now using Windows 7 Embedded) and compact flash. We use the EWF option which means that all disk use during operation to the C: drive is in RAM. For data we store this on a D: drive compact flash. We don’t continually write to this.
    After some 10+ years these machines are still in operation with only the occasional corruption to the OS disk (usually due to large power spikes). If we want to make changes to the C: drive, there is an option to commit changes if you install any drivers etc. If you switch it off by doing a shutdown or pull the plug, nothing happens to the C: drive.
    Someone once tried to update their machine by installing standard Windows XP to the compact flash. It lasted about 2 months before the disk was corrupted and read only. 🙂
    Granted our use is down to single application use and this method fits well, but I think this would work fine with the embedded devices we use and reduce the risk of corruption or even just power loss.

    1. Similar approach with Linux based embedded systems is also easy (UnionFS for example, but I prefer on the various ARM boards I use running Ubuntu Xenial with overlayfs since integrated nicely in Armbian). Adding a 2nd disk for real data with controlled behaviour (as you said ‘don’t continually write to this’) and wear-out is close to zero.

      But in my opinion it’s absolutely OK to let Raspbian or other OS run in normal mode off a quality SD card or other flash media as long as those writes that come with insanely high write amplification are handled correctly (those ‘just a few bytes every few seconds’ logging stuff mostly – stuff for log2ram, folder2ram and friends).

      Since all storage media will eventually die and neither HDDs, thumb drives nor SD cards feature a ‘health forecast’ (on a good SSD you can query the controller’s estimate, eg. SMART attribute 233 ‘Media_Wearout_Indicator’ on Intels or 177 with Samsung – unfortunately vendor specific) we simply try to use btrfs wherever possible, since as soon as monthly ‘scrubs’ (comparing checksums with real data) indicate that data corruption is starting, it’s time to replace the media. ReFS in Windows scrubs by default too but I doubt it’s available for Win 7 Embedded 🙁

  14. Great topic and good discussion. We live in an imperfect world and that’s the way I tend to look at all things computing. I mean – it’s all going to go wrong one day.

    Whilst we could search for the ‘Holy Grail’ of reliable storage, it still doesn’t exist, least of all for cheap SBCs.

    The approach I take is to run DietPi with only the software I actually use. All of my data and scripts are written to a USB drive and the whole system backed up regularly to a 128Gb USB drive.

    Only yesterday, I had a SD fail – due to my mis-timed brutal power down technique. Later, it was back up and running by plugging in a new dietpi card and restoring everything. The data was untouched.

    ** Other Strategies are available 🙂 **

    Good weekend everyone

  15. The answer there is “it depends”.

    Higher end consumer and the upper half of the “industrial” SD card space will be very reliable. These devices have superior controllers for the flash and the wear leveling will be nearly on, or on, a par with the mid-end USB 3.0 sticks out there. Even on USB 2.0 with the Pi 3.

    USB sticks, except for the cheapest, will be trending towards higher quality, higher IOPS (Which is the driving force with storage… What good is 90MB/s write speed when it can only muster 100 IO transactions per second? That’s where Patriot’s products fail. If you’re doing large block linear transfers read or write, they work as advertised, but do random small block access? Heh…), and better wear leveling. A USB stick will invariably be capable of speeds up to 150 MB/s on a USB 3.0 interface and easily muster as much as 40-45 MB/s on a USB 2.0 link. (It is worth noting that a Pi3 can and will boot off of one of the USB ports with a good USB stick)

    SSD is an option, but you’re still talking USB for the path there. It *SHOULD* be noted that WDLabs’ stuff doesn’t offer “SSD” directly. PiDrives are spinning rust stores. The closest thing there is the 2.5″ SATA to PI CM adapter board. But that entails buying a CM3 module, etc. I know for a fact you can get 50-ish MB/s out of a mSATA or NGFF M.2 to USB3 adapter on a Pi3 and you CAN boot from it just like you could a USB stick. Peak endurance for a project, obviously, will be this route, but you might be better served with a differing ARM board that does SATA correctly there.

    SPI flash does not have wear leveling and is limited to the speed of the SPI controller, so people, please…it works for a small subset of applications and you’re more constrained and worried about endurance than SD’s.

    eMMC’s a possibility in the same space/context as SSD (since it’s largely the same beastie in a link managed by an SD/MMC controller with as many as 8 lines for data), which makes it roughly as fast as most USB 3 SSD solutions would be. However, it’s only a possibility for the boards that offer it. NOT for Pis.

    1. ’50-ish MB/s’ over BCM283x’s single USB connection, really? BTW: I mentioned SPI NOR flash above not as a storage alternative but just as an option to flash a universal bootloader once to overcome the need for an SD card so more SBC can do what RPi 3 can do now: booting directly from network or USB storage 🙂

    2. Found it again. Those guys here claim write speeds to a SSD on a Raspberry of 58.7 MB/s: http://www.pi2design.com/store/p1/502SSD_-_mSATA_Solid_State_Drive_Shield.html

      Which is just the usual ‘benchmarking gone wrong’ (linux kernel buffering data in RAM) since 53.248 MB/s is the theoretical maximum and with USB’s Mass Storage Protocol adding a bug real speeds between a single host and device are way lower: http://heise.de/-325842 (sorry, german only)

      Anyway: since I will reference this thread/discussion I took the opportunity to do a quick test with four devices:
      – a rather fast SSD (Intel 540, 120GB)
      – an USB3 thumb drive (SanDisk Ultra Fit V2, 128GB)
      – an ok-ish SD card (Samsung EVO+, 128GB)
      – a random ‘notebook HDD I found in the drawer’ device (Seagate Momentus, 60GB)

      Tested on an RPi 3 running Raspbian Lite (OMV) with iozone ver. 3.429 and the following switches ‘-e -I -a -s 100M -r 4k -r 16k -r 512k -r 1024k -r 16384k -i 0 -i 1 -i 2’. The SSD and HDD were behind a JMicron JMS567 (which is a pretty fast USB-to-SATA bridge). I tested the HDD on the outer tracks (almost empty performance) and also on the inner (almost full performance):

      http://kaiser-edv.de/tmp/lGtv38/rpi3-storage-performance.txt

      All numbers in KB/sec so to get IOPS one has to divide through blocksize.
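The conversion from the KB/sec figures to IOPS is a single division by the record size. As a tiny Python sketch (the throughput value below is a made-up example, not a number from the linked results):

```python
def iops(kb_per_sec, record_kb):
    """Convert an iozone throughput figure (KB/sec) into I/O operations per second."""
    return kb_per_sec / record_kb


# Hypothetical random-write result of 8000 KB/sec, shown for the record
# sizes used in the test above (4k, 16k, 512k, 1024k, 16384k).
for record_kb in (4, 16, 512, 1024, 16384):
    print(f"{record_kb:>6}k records: {iops(8000, record_kb):.1f} IOPS")
```

The same throughput means very different IOPS depending on record size, which is why the small 4k and 16k results say the most about OS-style workloads.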

      The HDD gets bottlenecked on the inner tracks due to zone bit recording and being really slow/old. And shows strange random write behaviour (way too high numbers but too lazy to look further into).

      In the context of ‘avoiding writes wherever possible’ those numbers are irrelevant anyway but to do an educated guess if the device has to store data they might be helpful… At least I wouldn’t attach spinning rust any more if a large enough SD card or thumb drive is available (again: beware of counterfeit cards/devices!)

      BTW: I used iozone to be compatible to both Jeff’s great compilation here http://www.pidramble.com/wiki/benchmarks/microsd-cards and community driven benchmark efforts there: https://forum.armbian.com/index.php?/topic/954-sd-card-performance/

  16. Actually until now I may have missed the point about using a USB memory stick.. If you took temporary stuff and logs and put them on the stick, you are if nothing else reducing the writes to the main SD. I could give two hoots if I lost logs. On the other hand I’m looking at the log directory on the machine I’ve left in Spain and in total the logs only come to something like 15MB. I don’t know about others but I’ve used logs after power up to see if anything went wrong – but I have NEVER looked at logs more than a couple of days old.

    1. The problem especially with logs is not the amount of data you think of but that the real amount of writes to flash media is MAGNITUDES higher with default settings.

      The default commit interval of ext4 as used normally is just 5 seconds. Every now and then one of the many daemons on your machine decides there’s something log-worthy and eg. wants to write stuff like this (me logging into one of my small Lime2 servers):
      ‘Apr 6 15:29:30 localhost systemd[1]: Started Session c61 of user tk.’

      That’s just 70 bytes and should now be appended to /var/log/syslog in my case within the next 5 seconds. But… not 70 bytes are written but the amount of what will be reported by the following as “Block size”: stat -f /var/log/syslog

      If this is 4096 then the filesystem change will be exactly that. Since flash media won’t overwrite data the real amount of writes to flash cells can be significantly higher (now we would have to talk about page sizes and also Erase Block sizes especially if the SD card is already full).

      TL;DR: writing every 5 seconds just a few bytes to flash media or only every hour the few KB that accumulated in this time can make a difference of 1 to 1000 or even higher. That’s why it’s important to take care of /var/log (but extending the commit interval from 5 to 600 seconds also greatly reduces wear-out already)
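The filesystem-level part of that argument can be sketched in a few lines of Python. This is a deliberate simplification: it assumes one 70-byte log line per 5-second commit interval, rounds every flush up to whole 4096-byte blocks, and ignores journal and metadata writes as well as the page and erase block effects inside the card, all of which push the real ratio higher.

```python
import math


def bytes_written(line_bytes, lines, block_size, flush_every):
    """Bytes written at filesystem level when `lines` log lines are flushed in
    batches of `flush_every` lines, each flush rounded up to whole blocks."""
    total = 0
    remaining = lines
    while remaining > 0:
        batch = min(flush_every, remaining)
        total += math.ceil(batch * line_bytes / block_size) * block_size
        remaining -= batch
    return total


lines_per_hour = 3600 // 5  # one line per 5-second commit interval (assumed)
every_commit = bytes_written(70, lines_per_hour, 4096, flush_every=1)
once_an_hour = bytes_written(70, lines_per_hour, 4096, flush_every=lines_per_hour)
print(every_commit, once_an_hour, round(every_commit / once_an_hour, 1))
```

The block size to plug in for a real system is the value reported by `stat -f /var/log/syslog` as described above (`os.statvfs(path).f_bsize` in Python).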

      Some basics wrt what’s happening inside the FTL: https://flashdba.com/2014/09/17/understanding-flash-the-flash-translation-layer/

      1. I would avoid ext4 on an SD card, it is not designed for such device.

        OpenWrt mostly does not use ext4; it mostly uses jffs2.

        I have had broken routers with defective sectors that jffs2 could not deal with; someone said ubifs could deal better with those.

        Any thought on filesystem choice?

        1. Yeah, please don’t advise filesystems made for **raw** flash devices to be used on more intelligent flash media that come with a FTL (‘flash translation layer’ AKA controller doing the housekeeping and logical to physical mapping).

          On routers with raw NAND (or some SBC like the C.H.I.P. for example) jffs2 or ubifs are necessary but on SD cards or eMMC the controller inside deals with this stuff. It’s perfectly fine to use ext4, f2fs, btrfs or any other fs where you can limit ‘write attempts in a certain amount of time’ on flash media with FTL (SD cards, thumb drives, eMMC and the others).

      2. If you look at the modified blog – I’ve included a link to LOG2RAM (not the original but one with instructions) – I’ve tested it on a Pi2, remembering in my case to take note of their reference to Apache (without which Apache won’t start if you have it set up)… so that is one tiny piece of the puzzle solved without too much heartache.

        Your comments about cables etc are also good advice – and I should know better. Some time ago I bought one of those cable testers with a LED display and tested it on my phone and tablets. I was horrified at the difference in voltage drops from one to the next – so much so that several of my thin, pretty white cables ended up in the bin… but I didn’t think to transfer that knowledge to wiring up the Pi.

        So my simple takeaways from this discussion so far, easily implemented by all..

        1. Implement LOG2RAM
        2. Test if possible or otherwise replace USB leads with decent thick wire (not thick plastic – and that’s another discussion as some of the thick looking cables are no better than the thin ones).
        3. Use the software you’ve mentioned to test SDs when you first buy them
        4. Only use good quality SDs – including Sandisk and Samsung (though in my case it is a Samsung Evo that went down but any of the above problems could have been the cause).

        SIMPLE additions to that list welcome.

      3. Have you got any instructions on how to increase the /var/log commit interval? I am having trouble finding anything other than the RAM drive idea.

        Thanks

        1. The ram drive is for temporary storage – you define the period for updating the SD from the Ram drive – could be hourly, daily and of course on shutdown.

        2. depends on where you put the script, as in /etc there are many folders:
          cron.hourly
          cron.daily
          cron.weekly
          cron.monthly

          which should be self-explanatory… if this is not enough, then find a guide on how cron works and add your custom schedule to /etc/crontab or your own crontab
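On the commit interval question itself: ext4 accepts a `commit=<seconds>` mount option (default 5), so the 600 seconds mentioned earlier in this thread can be set in the options column of /etc/fstab. The Python helper below is only a sketch that rewrites a single fstab line as text. The device names are hypothetical and the function name is invented for this example; it does not touch any file. Keep in mind that a longer interval also means more unwritten data can be lost on a power cut.

```python
def with_commit_interval(fstab_line, seconds=600):
    """Return the fstab line with commit=<seconds> set if it is an ext4 entry."""
    fields = fstab_line.split()
    # Leave comments, blank or short lines and non-ext4 entries untouched.
    if fstab_line.lstrip().startswith("#") or len(fields) < 4 or fields[2] != "ext4":
        return fstab_line
    # Drop any existing commit= option, then append the new one.
    options = [o for o in fields[3].split(",") if not o.startswith("commit=")]
    options.append(f"commit={seconds}")
    fields[3] = ",".join(options)
    return " ".join(fields)


# Hypothetical root filesystem entry, as found on many SD card images.
print(with_commit_interval("/dev/mmcblk0p2 / ext4 defaults,noatime 0 1"))
```

After editing /etc/fstab by hand with the resulting options, a remount or reboot is needed for the new interval to take effect.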

  17. I worked for FusionIO and SanDisk so I know a few things about flash memory.

    The bad news: as density (bits per flash chip) has increased, the number of writes (how many times a flash memory cell can be changed) and endurance (how long the cell will retain a 1 or 0) have decreased. Writes (or more precisely program / erase cycles) are down in the 1000s. So how is flash even usable? Clever controllers (wear levelling, etc.) and huge capacity (when you have billions of cells, even if each can only be written 1000 times it will take a long time to write every cell 1000 times).
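A back-of-envelope estimate shows how capacity and cycle count combine. This Python sketch assumes ideal wear levelling across the whole card; the daily write volume and the write amplification factor are hypothetical inputs, not figures from this thread.

```python
def card_life_days(capacity_gb, pe_cycles, host_gb_per_day, write_amplification):
    """Days until every cell has used up its program/erase cycles,
    assuming perfect wear levelling over the full capacity."""
    total_flash_writes_gb = capacity_gb * pe_cycles
    return total_flash_writes_gb / (host_gb_per_day * write_amplification)


# 1000 P/E cycles as mentioned above; 1 GB/day of host writes and a
# write amplification of 10 are made-up illustration values.
for capacity_gb in (8, 16, 32):
    days = card_life_days(capacity_gb, 1000, 1, 10)
    print(f"{capacity_gb} GB card: about {days / 365:.1f} years")
```

Doubling the capacity doubles the estimate, and so does halving the write amplification, which is where the logging tweaks discussed earlier come in.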

    As long as you’re not doing tons of writing and overwriting, an SD card should have a life in years for Raspberry Pis (and similar boards). So what makes a difference?

    Counterfeit cards – This is by far the biggest problem. The market is flooded with counterfeit cards: inferior designs, poor manufacturing, under capacity (ie. an 8 GB card marked and sold as 16 GB). Counterfeiters are good at making their cheap knock-offs look like name-brand cards, and they are good at getting their wares into the supply chain. So buy from retailers that put effort into buying genuine SD cards – since this is usually difficult for consumers to judge, the best you can do is to buy from big retailers who are more likely to have the manpower to keep an eye on their supply chain. When you get a new SD card, run the programs tkaiser suggested to see if you have a counterfeit.

    Extra Capacity – Buy a much bigger card than you need. As mentioned already, if you have more space it will take longer to use up the limited number of times each flash cell can be written. BTW, all brand-name SD cards do wear levelling and other tricks to make their cards last longer; if they didn’t a new SD card would give errors very soon. And essentially USB sticks use the same controller as SD cards so while you might be able to get better read/write performance from a USB stick, you won’t get better life.

    Limit Writes – As discussed above, send your data off to a NAS or the cloud if you don’t need it local on the Pi, move logs and other often changed files to RAM, make your OS read-only, etc. The EmonPi guys have done a lot of good work on this front.

    1. Well, KanyonKris, I’m not sure if I feel better for that new knowledge or WORSE – I had no IDEA the write capability (aside from wear levelling) was that low.

      I’ll bet I’m not the only one who thought you could write to a real location thousands of times and then the wear levelling might move the info elsewhere.

      Stunningly bad. So when we are looking at a Linux installation of 3-4 GB, a 32GB card doesn’t seem so daft after all.

      Thanks to all for the continued valuable insights coming into this discussion.

  18. Another comment from Google+

    Bob George

    The early RPis would reboot if you sneezed near them, and card corruption was a real problem. Some firmware updates and newer models are much more robust. Since switching to using Samsung cards a few years ago, I’ve had only 1 card failure across hundreds if not thousands of power pulls in recent years.

    It’s still a problem, but not anywhere near as pervasive as you’d think given all the articles on the topic. There are some good discussions on stackexchange about what SD card lifespans mean in practical terms, with the consensus leaning towards it not being a problem for years in the average case. If you need something more stable, a $35 RPi intended for educational purposes is probably not what you need.

  19. As I know pretty much nothing about SD I’m finding this fascinating. I live where my Pi and SD card reside so a failure will only be a nuisance. That said I always worry that I will wake up one day to a dead Pi so I have a NAS used for all sorts of backup at home which is mounted on the Raspberry Pi.

    It is as simple as a line in /etc/fstab for anybody who has one available

    //192.168.2.***/GoFlex\040Home\040Public /media/networkshare/public cifs username=yourloginname,password=yourpassword,uid=1000,gid=1000,iocharset=utf8,noauto,x-systemd.automount 0 0

    A cron task at startup runs a bash script to “sudo mount -a”, but at a safe time, 30 seconds or so after startup.

    Probably displaying my lack of knowledge but any database or logging I have set up from node-RED on the Raspberry Pi writes to folders on the NAS. Again I may have my legs slashed off at the knee for saying this but I also schedule cron to dd | gzip the entire disk image.

    sudo /etc/init.d/cron stop
    sudo dd if=/dev/mmcblk0 | gzip > /media/networkshare/public/AlarmPi_SD_BACKUP/SDCardBackup.img.gz
    sudo /etc/init.d/cron start
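For anyone who would rather drive the same backup from a script, the `dd | gzip` pipeline above can be approximated with the Python standard library. This is a sketch only: the function name is invented, reading /dev/mmcblk0 needs root, and an image taken from a running, mounted system may not be fully consistent.

```python
import gzip
import shutil


def backup_image(source, dest_gz, chunk_bytes=4 * 1024 * 1024):
    """Stream `source` (a device or file) into a gzip file, like dd | gzip."""
    with open(source, "rb") as src, gzip.open(dest_gz, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_bytes)
```

With the paths from the commands above, the call would be `backup_image("/dev/mmcblk0", "/media/networkshare/public/AlarmPi_SD_BACKUP/SDCardBackup.img.gz")`.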

    The card is not new and is now 75% full making tkaiser’s instructions about extra space on the card echo like a warning to me!

    So does the kind of data I export to my NAS make a jot of a difference beside the regular file system writes that are constantly happening on a SBC’s OS?

    Garry
    *Super content again Pete with top quality feedback, much obliged!

  20. Some things being forgotten in this discussion:

    1) SD != SSD
    2) SD were never meant to be continuously written to
    3) SD may be a block device but Linux is writing bytes (update a few bytes, update the block; worse on SD)

    There are a number of good suggestions above. Some that address the list.

    Let me join the ‘SD failed after a year’ crowd (Pi 2 and the latest Pi 2). One of the things I’ve done to reduce writing to the SD is to only boot from the SD but then make root an SSD (or, in the future, a network drive). I’ve been playing with an SSD but power seems to be an issue (or a bug in the options I’m playing with). I’ll take what I’ve learned here and see how this helps.

    One thing I do is to change the rsyslog.conf (or syslog.conf if you have that) to send all messages to another server I have. So no more local logs. I also have network mounts so much can be done across the CIFS connection (I don’t know if that’s the best network FS, need to try NFS also).

    I’m going to look into the read only and the other options (some of which I have). More to follow.

    1. Well, an SSD connected to a Raspberry seems like a waste of resources given how low the IO bandwidth is (all USB ports and Ethernet share one single USB2 connection to the Broadcom CPU). I would choose one of those tiny things instead: https://www.everythingusb.com/mini-drives.html

      Just as a reference but there are other SBC out there that show magnitudes better storage performance and are more suited for tasks that could benefit from a real SSD: https://forum.armbian.com/index.php?/topic/1925-some-storage-benchmarks-on-sbcs/

      Wrt SSD and power requirements. I test around with 4 different SSD in the meantime on SBC: Samsung EVO 840, EVO 750, Samsung PM851 (this is an EVO 840 OEM I found in a customer’s server) and an Intel 540.

      3 of them are perfectly useable host powered on USB ports but the EVO 750 can be used to reset boards. Seems to have an insane peak consumption when connected. Whatever it is, usually connecting it to a board the leds go out and the thing either freezes or reboots 🙂

      1. i took the lexar s45 suggested in that first link, 32gb model (the 16gb costs 1€ MORE than the 32gb one in Italy…), followed this guide to first install on a temp sdcard, enable the usb boot flag in rpi3, and copy all to usb dongle:
        https://www.raspberrypi.org/documentation/hardware/raspberrypi/bootmodes/msd.md

        only mods, the 2 SED lines are wrong, i did them manually because you need to change 2 lines in fstab, not one, and 1 in cmdline.txt… took off the sdcard, rpi3 booted flawlessly from usb, run script (stock+habridge and wiringpi), total time: about 1 hour…

        i think the same process can be done with other distros, now that usb boot is enabled, as all there is to do is copy from sd to usb and modify 2 files… have to test others… no armbian, as it does not exist…

        1. Well, I run a slightly modified Armbian 64-bit userland on my Raspberry Pi 3 but we will (hopefully) never officially support Raspberries since they’re a support nightmare (though that applies to other beefy boards with Micro USB for power as well).

          BTW: Running ‘the script’ became now part of my usual routine (since I’m fine-tuning Armbian’s build system to produce clean OpenMediaVault OS images) and if storage and network is fast and the OS contains some tweaks the script execution can be pretty fast. I just got this on an ODROID-C2 on Hardkernel’s ultra-fast eMMC and behind a 100Mbps Internet connection (everything installed except WiringPi and habridge):

          Total: 00h:09m:14s Cores: 4 Temperature: 45’c

          1. nightmare or not, they’re what gave a start to this whole world of little boards… it’s no coincidence everybody STEALS the “PI” part, as NO ONE is really 100% compatible with the original, and many of the chinese cloners do not even create decent software with gpio and other needed features to be eligible to be named PI… the community they have is invaluable…

            can you share your setup, or some hints on how to get there? I’m very interested, i don’t like standard raspbian, as really no goodies like on dietpi or performances of armbian are there…

            do you think it’s usable from usb, too, as i did now? i think yes, because now my pi has the usb boot flag enabled in firmware…

            thanks

            1. I’m perfectly fine with both Raspberries and Raspbian, I just want to avoid dealing with (even more) reluctant users who don’t read documentation and not even our ‘getting started guide’, then either fail to burn the OS image to SD card (since they don’t follow our recommendation to use Etcher) or suffer from permanent instabilities since power is insufficient (that’s the other truth no one wants to hear). We would have to waste even more time with users complaining about software issues that are in reality only hardware failures. No, thanks.

              Re Armbian on RPi 3: It’s taking a 64-bit image (with 64-bit RPi kernel see here https://github.com/bamarni/pi64 and for mainline kernel there https://github.com/michaelfranzl/rpi23-gen-image ) then throwing everything away except for /boot and /lib/modules and replacing it with the rootfs of an ODROID-C2 or Pine64 Armbian image. And then after a reboot you have to check the output of

              dpkg -l | egrep " armbian| linux" | egrep "Linux|U-Boot"

              And then do an ‘apt remove/purge’ for all of these packages except ‘armbian-firmware’ and adjust /etc/hostname. But there’s still too much missing or different since Armbian’s semantics are for ARM boards but all Raspberries are VideoCore IV devices that boot a proprietary main OS on the VPU and only later start Linux as a guest OS on the ARM core(s). There’s nothing wrong with it (besides the proprietary nature of this) but partitioning and bootloader requirements are simply too different (so nand-sata-install won’t work either without heavy modifications).

              Maybe this will change sometimes in the future but I wouldn’t hold my breath for (adding new devices to a build system is easy, supporting them and their users is a different story)

              I believe you were talking about https://github.com/ThomasKaiser/install-iot-stuff (haven’t done anything the last days there)?

              BTW: ‘the script’ on a really fast ODROID-XU4 running off the crappiest SD card I own (an old Kingston 16GB) finished in the meantime:
              Total: 00h:54m:01s Cores: 8 Temperature: 67’c

          2. can’t find your comment where you published that repo where you are modifying the script… there were some good tricks i’d like to “borrow back” 🙂
            can you address it to me? thanks

          3. Excellent to hear, TKaiser – nice to know “the script” is getting used. Well, a few days break now, once I get settled in Spain I want to incorporate the latest info in here re: power reliability into some kind of quick Node-Red or similar script which will permanently run in the background and let me know if there have been any power problems and when. More on that later.

            For now I’m busy emptying out my desk and bubble-wrapping everything up in preparation for the long trip tomorrow.

  21. Comments from two guys in Google+

    Daniel Mayer

    I have personally never had an SD-Card fail after having worked before. Some gave me block errors immediately and I was able to return them, but all that ran are still running.

    I got one 2GB class 4 sd card that has been in a Raspberry Pi 1 for over 4 years, it’s dead slow but it works.

    Apart from that I always use class 10 sandisk extreme, and that has been working for over a year in a Raspberry Pi 2 (later 3) as well.

    I have tried cheap mixza micro SD-Cards, but 2/3 broke immediately

    Leo Vendler

    My setup has been running for 3 years now without any degradation: Pi + 32 Gb USB formatted with f2fs, /temp and /var/log moved to the USB. Runs 24/7.

  22. It’s interesting that mostly Raspberry Pi users don’t trust in SD cards at all 🙂

    Reason is partially related to the biggest mistake RPi people ever made: Using Micro USB to power the board. This encourages users to use crappy phone chargers to power their boards with average (read as: crappy) USB cables between PSU and device. Most USB cables have a resistance way too high so even with light loads the voltage starts to drop and then the VideoCore’s SD card controller starts to write garbage to the card resulting in the usual corruption issues on next boot.

    Funnily setting max_usb_current=1 in /boot/config.txt massively increases the risk of SD card corruption due to Ohm’s law (now your connected USB disk is allowed to draw not just 600mA but 1200mA; if it does, the voltage available to the board drops even more)

    This unfortunate connector is also available for poor performance on RPi 3: https://github.com/bamarni/pi64/issues/4#issuecomment-291425512

    Many of those SD card corruption issues are gone when powering the boards more reliably (GPIO pins 2/4/6) or by using a good PSU with fixed cable or visiting Amazon, searching for ‘20awg micro usb’ and replacing the currently used USB cable with a short one with low resistance. IMO issue is explained here in a good way: https://www.loverpi.com/blogs/news/93532993-canakit-2-5a-vs-loverpi-2a-power-adapter-comparison
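
The Ohm's law argument can be put into rough numbers. The sketch below is only an illustration: the per-metre resistances are rounded textbook values for copper wire, the 1 m cable length and the 0.6 A / 1.2 A currents are assumptions picked to mirror the figures mentioned above, and connector resistance defaults to zero, which is optimistic for Micro USB.

```python
# Approximate resistance of copper wire at room temperature, in ohms per metre.
# Rounded AWG table values; real cables vary, so treat these as assumptions.
OHMS_PER_METRE = {20: 0.0333, 22: 0.0530, 24: 0.0842, 28: 0.2129}


def cable_voltage_drop(awg, length_m, current_a, connector_ohms=0.0):
    """Voltage lost in a two-conductor cable (out on 5V, back on GND)."""
    wire_ohms = OHMS_PER_METRE[awg] * length_m * 2  # current flows there and back
    return current_a * (wire_ohms + connector_ohms)


def voltage_at_board(supply_v, awg, length_m, current_a, connector_ohms=0.0):
    """Supply voltage minus the drop across the cable (V = I * R)."""
    return supply_v - cable_voltage_drop(awg, length_m, current_a, connector_ohms)


if __name__ == "__main__":
    for awg in (20, 28):
        for amps in (0.6, 1.2):
            volts = voltage_at_board(5.0, awg, 1.0, amps)
            print(f"{awg} AWG, 1 m, {amps} A -> {volts:.2f} V at the board")
```

Under these assumptions a 1 m 28 AWG cable carrying 1.2 A loses about half a volt, while a 20 AWG cable of the same length loses under a tenth of a volt. Any connector resistance makes the thin cable's result worse still.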

    1. Wrote ‘available’ but meant ‘responsible’ — bad in multitasking 🙁

      Anyway, if you avoid the many potential sources of problems (counterfeit cards, powering problems on Raspberries, high write amplification — all these problems can be fixed/avoided and need no work-arounds like HDDs/NFS) SD cards are a pretty good choice for data that has to be written. Just choose the right brands and capacities 🙂

      1. So as a compromise between cost and assumed quality – I have been using Samsung Evo 16g – and Sandisk Ultra 16g…. I of course have not checked to see if they are real but this is the first time I’ve ever had a failure. Good choice of SDs and bad luck? Or bad choice of SD? It is the Evo that failed.

        1. Impossible to tell now 🙁

          Based on the symptoms you told I would believe it’s a counterfeit card but usually the fraudsters start with larger capacities.

          BTW: card parameters can be read out and usually those fake cards show weird entries for ‘serial’ and the like. Just search here http://sprunge.us/WAhP for the string ‘### mmc0’ — this is a genuine 64 GB EVO bought a year ago.

          This wall of text is ‘Armbian debug output’ — if users run into problems, we ask them to supply the URL an ‘armbianmonitor -u’ call spits out to get an idea what’s going on. I added logging of these MMC parameters one year ago for exactly that reason: to be able to spot fake cards remotely. Same with the ‘### quick iozone test’ line below. Armbian OS images do a quick performance test on the first boot and if especially the last value (random write performance) is low you get a warning.

          We had one occurrence of a ‘good counterfeit’ card that reported somehow correct metadata but failed horribly especially with random IO performance (below 100 IIRC) so it was easy to spot the fake.
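
For anyone wanting to look at those card parameters on their own board, here is a minimal sketch. It assumes the card shows up as mmcblk0 and that the kernel exposes the usual MMC attributes as small text files under /sys/block/mmcblk0/device; both the path and the attribute list are assumptions and may differ on a given system. The function name read_card_info is made up for this example.

```python
import os

# Attribute files commonly exposed by the Linux MMC driver (assumed, not guaranteed).
FIELDS = ("name", "manfid", "oemid", "serial", "date", "hwrev", "fwrev")


def read_card_info(device_dir="/sys/block/mmcblk0/device"):
    """Return a dict of card metadata; missing attributes are reported as None."""
    info = {}
    for field in FIELDS:
        path = os.path.join(device_dir, field)
        try:
            with open(path, encoding="ascii", errors="replace") as handle:
                info[field] = handle.read().strip()
        except OSError:
            info[field] = None  # attribute not available on this kernel or device
    return info


if __name__ == "__main__":
    for key, value in read_card_info().items():
        print(f"{key:8} {value}")
```

The output only shows what the card reports about itself. Judging whether a serial or manufacturer ID looks "weird" still needs a known genuine card to compare against.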

          7 years old but still an interesting read: https://www.bunniestudios.com/blog/?p=918

    2. Thoughts worthy of remembering there TKaiser. In the particular example you mention – guess what – I’ve a crappy plug-in the wall supply feeding the Pi. Hmmm I can see a better supply coming on here – especially as I’m now using a Pi3.

      1. I did some testing with the USB cables I had lying around and the results were surprising, the thinnest cable gave the lowest volt drop.

        In the end i bought some ANKER cables and when I bought a Pi3 I also bought the dedicated power supply.

        I have a HP Micro server running Openmediavault and that’s been running off a Sandisk uSD card nearly 12 hours a day for about 5 years
        (I realise now I’ve said it, it will fail tomorrow)

        1. You said it – it will pack in this morning 🙂 Well done – clearly you did something right to get 5 years out of it.. Incidentally – OpenMediaVault – I figured it would be useless on a Pi due to lack of RAM…. does it operate ok? Must do I suppose if you’ve kept it going that long…

          If you are pleased with those cables, want to put a link in here?

          1. I have Openmediavault running on an HP microserver and really I only use it to store and serve media files to my box under the TV running Kodi. The install is way out of date now, it won’t even install updates because it’s so old. I really should do a clean install with the latest version but if it ain’t Broke………

            I was using it to do the things I now do with the Pi.

            I will see if I can find the link to the USB cables, I thought I got them from Amazon but I can’t see them in my order history, maybe my wife is right, I do buy too much cr@p!!

      1. My standard test whether powering is sufficient or not is simply running something heavy and then asking the ‘firmware’ whether undervoltage occurred or not. You need ‘vcgencmd get_throttled’ to query the firmware, convert the hex output to binary and look at bit 16. The following does that in a single line (remove the ‘| cut -c4’ to see all bits):

        perl -e "printf \"%20b\n\", $(vcgencmd get_throttled | cut -f2 -d=)" | cut -c4

        A lightweight load might be a connected USB disk + ‘stress -c 4’ but I always prefer cpuburn-a53: https://github.com/ssvb/cpuburn-arm

        When the above call after such a load test returns a 0 then you’re fine, with a 1 you’re in trouble since powering is insufficient. 🙂
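
To see what that one-liner actually extracts, here is the same logic as a small Python sketch. The function names are invented for illustration; the input is assumed to be a line of the form throttled=0x50005, which is the shape the cut -f2 -d= in the command implies. One detail worth knowing: %20b pads with spaces, not zeros, so when bit 16 is clear and no higher bit is set, column 4 comes out as a blank rather than a 0.

```python
def throttled_value(vcgencmd_output):
    """Parse a line like 'throttled=0x50005' into an integer."""
    return int(vcgencmd_output.strip().split("=", 1)[1], 16)


def fourth_column(value):
    """Mimic `printf "%20b" | cut -c4`: column 4 of a space-padded, 20-wide binary string."""
    return format(value, "20b")[3]


def undervoltage_seen(value):
    """True if bit 16 is set, independent of how the number is printed."""
    return bool((value >> 16) & 1)


if __name__ == "__main__":
    for raw in ("throttled=0x0", "throttled=0x50000", "throttled=0x50005"):
        value = throttled_value(raw)
        print(repr(fourth_column(value)), undervoltage_seen(value), raw)
```

Testing the bit directly, as undervoltage_seen does, avoids the blank-versus-zero ambiguity.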

        1. So – if you then remove any extras (I tried removing keyboard and mouse) should that go down to 0 or do you have to do something to reset that value – all I get is 1 and I have a Pi2 on a decent supply with nothing more than a non-self-powered USB connection and the Ethernet connection.

          1. You need to reboot so the firmware resets the default state to 0. Just to toggle it once undervolting occurs again. Though don’t know whether reboot is sufficient or you even need to do a cold boot (shutting down, disconnect power, power on again).

            BTW: could be so easy if this bit would be checked in a shutdown routine in Raspbian each time it occurs doing an ‘echo "$(date): undervoltage detected on shutdown" >>/var/log/undervolted.log’ and then check on startup count of entries and when exceeding a certain number give the user a hint how to fix the problem.
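
A rough sketch of that idea, not anything Raspbian ships: one function appends a line at shutdown when bit 16 is set, another counts the entries at startup and returns a hint once a limit is reached. The function names, the limit of 3 and the wording of the hint are all assumptions for illustration. The throttled value is passed in as an integer so the logic can be tried without a Pi.

```python
import datetime
import os

UNDERVOLTAGE_SINCE_BOOT = 1 << 16  # bit 16, as described above


def record_if_undervolted(throttled, log_path, now=None):
    """Append one line to log_path if bit 16 is set. Returns True if a line was written."""
    if not throttled & UNDERVOLTAGE_SINCE_BOOT:
        return False
    stamp = (now or datetime.datetime.now()).isoformat(timespec="seconds")
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{stamp}: undervoltage detected on shutdown\n")
    return True


def startup_hint(log_path, limit=3):
    """Return a warning string once the log holds at least `limit` entries, else None."""
    if not os.path.exists(log_path):
        return None
    with open(log_path, encoding="utf-8") as log:
        count = sum(1 for line in log if line.strip())
    if count >= limit:
        return f"Undervoltage was recorded on {count} shutdowns: check PSU and USB cable."
    return None
```

On a real system the first function would be fed the parsed output of vcgencmd get_throttled from a shutdown hook, and the second called from a login or startup script.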

            On the other hand it would be better to fix the problem in the first place and to exchange this Micro USB joke with a sane barrel connector 🙂

        2. This test you have suggested TKaiser… (AND ALL – ANYONE WITH MORE INFO PLEASE COMMENT)

          perl -e "printf \"%20b\n\", $(vcgencmd get_throttled | cut -f2 -d=)" | cut -c4

          Produced a 1 on my Pi2 test unit here. I rebooted it and rather than 0, it showed nothing.

          I plugged a mouse in with the power on – no doubt causing a spike – and sure enough when next reading that value it went back to 1.

          Surely it is not beyond someone’s wit to figure out how to reset that… this should be something we’re running constantly in the background – for me – NODE-RED could do this and alert me by email.

          So leaving the mouse in – I rebooted again (sudo reboot).

          The value was 1. I removed the mouse and rebooted again. The value was clear (nothing).

          I put a meter on the power. 4.96-4.97 (reliably). I plugged in the mouse. The value changed to 1. The meter read 4.95 -4.96.

          I rebooted with the mouse in. The “1” was back and the power remained at 4.95-4.96

          SURELY it can’t be THAT sensitive? Are you absolutely SURE that this value indicates power issues??

          1. https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=147781&p=972822#p972790

            (if dom doesn’t know who else? 🙂 )

            I made three quick tests, same 5V @ 1A (!) PSU, first time with a short 30cm 20AWG rated cable and cpuburn-a53. Repeated the same with 1.8m cable (also 20AWG): still no problem.

            Then tried my ‘demonstration device’: an average/crappy USB cable — the one that taught me the undervoltage lesson years ago: http://forum.lemaker.org/forum.php?mod=viewthread&tid=8312&extra=page%3D1

            RPi 3 just booting now fell already below 4.63V:

            root@raspberrypi:~# perl -e "printf \"%20b\n\", $(vcgencmd get_throttled | cut -f2 -d=)"
            1010000000000000101

            Bits 16 (undervoltage) and 18 (throttling — this is IIRC RPi 3 only) are both set. So ‘booting’ is already too heavy for this pile of junk (an average 28AWG rated USB cable). And when I now run only slight loads cpu clockspeed will immediately be limited to 600 MHz to prevent freezes/crashes.
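
The bit string above can be decoded mechanically. In this sketch, bits 0, 16 and 18 carry the meanings given in this thread. The entries for bits 1, 2 and 17 are taken from the Raspberry Pi firmware documentation as best recalled, so treat them as assumptions to verify. The decode function is a made-up helper.

```python
# Bit positions in the value returned by `vcgencmd get_throttled`.
FLAGS = {
    0: "under-voltage right now",
    1: "ARM frequency capped right now",          # assumption, not from the thread
    2: "throttled right now",                     # assumption, not from the thread
    16: "under-voltage has occurred since boot",
    17: "frequency capping has occurred since boot",  # assumption, not from the thread
    18: "throttling has occurred since boot",
}


def decode(value):
    """Return the descriptions of all known flags set in `value`, lowest bit first."""
    return [name for bit, name in sorted(FLAGS.items()) if (value >> bit) & 1]


if __name__ == "__main__":
    sample = int("1010000000000000101", 2)  # the output quoted above
    print(hex(sample))
    for line in decode(sample):
        print(" -", line)
```

For the quoted output this lists bits 0, 2, 16 and 18: the sample has the two "right now" flags set as well as the two "since boot" ones.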

            Just open another shell and enjoy the output of ‘while true; do vcgencmd measure_temp && vcgencmd measure_clock arm; sleep 2; done’.

            So they managed to workaround the problem instead of fixing it. And I bet that the problem in most cases is the cable and not the PSU. It’s all just Ohm’s law adding rather high contact resistance of a crappy connector (Micro USB) with cable resistance of average (crappy) cables.

            Those cables Antonio recommended or any other with 20/22 AWG ratings will solve 95% of the problem. Unfortunately powering through GPIO header is both more reliable and problematic (bypasses some protection mechanisms).

            1. So my tests are… confusing. Given your line which gives out the frequency and temperature..

              On power up I got 900m …. plug in mouse – 600m, disconnect mouse – usually but not always back to 900m.

              So I plugged in a linear PSU capable of 3amps+ variable. I set the output to 5v.

              Results were inconclusive. still fluctuating between 600 and 900. I adjusted the supply so the output at the Raspberry Pi 2 pins was 5.10v – surely enough…. yet I still saw the frequency dropping to 600 sometimes as if I’d made no difference.

              Right now at 5.10v with temperature 40c I’m watching it alternate between 600 and 900, Freezer spray down to 6.7 degrees still sitting at 600 occasionally jumping to 900 so it has nothing to do with processor temperature. Still going back and forth.

              So with supply at the board from 4.9 to 5.3v (no mouse or video plugged in) I am still seeing that speed going from 600(00000) to 900.

              Not making sense to me.

              1. Oh, cpufreq scaling behaviour is a different beast (and behaves differently on Pi 2 and 3 and depends on a lot of other parameters). I just assembled a simple monitoring script:

                https://pastebin.com/raw/M8i0rZbE

                Check example output in the comments section, then simply save and run it with ‘bash ./watch-my-pi.sh’ while performing tasks.

          2. To look at actual under-voltage situation bit 0 should be checked, that’s always the last one: perl -e "printf \"%20b\n\", $(vcgencmd get_throttled | cut -f2 -d=)" | grep -o '.$'

            This will report 0 when voltage is above 4.63 or 1 when below.

  23. I don’t know if this is applicable to your project, but one way would be to set the Raspberry Pi rootfs as read-only after setup, then you could update log and other data on a more reliable platform like a NAS or cloud service. This would require some extra work to find out which processes need to write to the micro SD card, and then redirect that data.

    NAND or eMMC flash should be more reliable than micro SD card. I don’t have hard data but all consumer devices around use NAND or eMMC flash, and can be used for several years (usually).

  24. This doesn’t directly answer your question but it might be an alternative.

    I treat my PI (including its SD card) as being a disposable computer resource.
    I have a Synology (DS213j in my case) NAS box with redundant RAID1 disks, and I use NFS on my PI to mount space on the NAS box as /home on my PI.

    If the SD card fails, I just plug in a new one with the same image on and continue as all the user data is remotely mounted.
    I still store logs and stuff on the SD card but nothing I can’t do without.

    1. I too have a Synology – could be a good idea. Care to expand on how you took your existing setup and moved data to the Synology? I can give that a shot and I’m sure others might too if simple enough. It relies on the network but if that fails you’re generally stuffed anyway.

  25. I have been using a raspberry pi in my summer kitchen for 4 years. The SD-card is read only:
    $ cat /etc/fstab
    proc /proc proc defaults 0 0
    /dev/mmcblk0p1 /boot vfat ro,noatime 0 2
    /dev/mmcblk0p2 / ext4 defaults,noatime,ro 0 1
    tmpfs /tmp tmpfs nodev,nosuid,size=16M 0 0
    tmpfs /var/log tmpfs nodev,nosuid,size=16M 0 0
    tmpfs /var/lock tmpfs defaults,noatime,mode=0755 0

    It is used to collect temperature and humidity, control fans and play radio.
    It works 24/7. The SD-card is the cheapest 2Gb I found 4 years ago.
    The raspberry is situated under the roof so in the summer there are 60C and in the winter -25C

    In conclusion: If you want a long life SD card mount it read only. If you want to store something do it on your internet server or on a local HD
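
A quick way to double-check a setup like this is to parse the fstab and list any card partitions that are not mounted read-only. The sketch below is an illustration only: the helper names are invented, and the /dev/mmcblk prefix assumes the card appears under that name, as it does in the fstab quoted above.

```python
def parse_fstab(text):
    """Turn fstab text into a list of dicts, skipping blanks and comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # not a usable entry
        device, mountpoint, fstype, options = fields[:4]
        entries.append({
            "device": device,
            "mountpoint": mountpoint,
            "fstype": fstype,
            "options": options.split(","),
        })
    return entries


def writable_on_card(entries, card_prefix="/dev/mmcblk"):
    """Mount points that live on the card and lack the 'ro' option."""
    return [
        entry["mountpoint"]
        for entry in entries
        if entry["device"].startswith(card_prefix) and "ro" not in entry["options"]
    ]


if __name__ == "__main__":
    with open("/etc/fstab", encoding="utf-8") as handle:
        print(writable_on_card(parse_fstab(handle.read())) or "card is read-only in fstab")
```

This only checks what fstab asks for. A partition remounted read-write by hand later would not show up here.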

  26. avoiding the question really, I use a USB to SATA adaptor and run the rPI from a hard drive. I have plenty of dead hard disks kicking around so it is only a matter of time before they go dark!

    1. And indeed that might work in some instances. Raspberry Pi 3 and many other boards – RPI2 has to boot off SD – though you can put everything else on HD. This does of course affect SIZE and COST quite a bit.

      1. An alternative are mini thumb drives. After evaluating a few of them by specs I chose a ‘SanDisk Ultra Fit V2 128GB, USB 3.0 (SDCZ43-128G-GAM46)’. Currently evaluating it — way better random IO performance than spinning rust which is important, also no moving parts.

        We’ve a bunch of ultra slow Kingston SD cards from a customer (sequential performance ok, random write horribly low and this is what affects stuff like apt-get upgrade or installation of ‘The Script’). I let your script run on the Kingston card and it took ~3 hours, then started from scratch, used ‘nand-sata-install’ to move OS to the USB thumb drive (chose btrfs and set ‘compress=lzo,commit=600’ in fstab), now it’s running again and I would assume finishes in less than 30 minutes.

        So this will be the use case for those Kingstons: Bootloader only on it while everything else running off of USB or real SATA drives. Fortunately Xunlong started to add 2MB SPI NOR flash to Orange Pis some months ago and other vendors are following (currently just Olimex and Pine Inc, I’ll have to ping FriendlyELEC about that) so RPi 3 behaviour being able to boot from network/USB comes to other SBC as well in the future and we can avoid SD cards at all 🙂

        1. You or others may correct me but am I not right in saying, despite your comments about speed, that the central plank of this discussion still applies – Thumb drives are surely similar technology and therefore must surely have the same WRITE limitations!

          1. ‘Similar technology’ includes also ‘PCIe attached enterprise SSDs’ in servers, yes. All those use flash instead of magnetic platters but there the similarities end.

            The average SD card is made for sequential transfers only (use in cameras, video recorders or surveillance gear) and this is somewhat different with USB thumb drives, they perform much better with random write patterns (especially when you choose USB3 ones which I do even when use case involves USB2 ports only just to avoid buying slow crap). This will change in the future wrt SD cards fortunately: http://www.cnx-software.com/2017/02/13/sd-specifications-5-1-to-introduce-app-performance-class-for-random-io-logo/

            To be honest: I don’t fear writes to SD cards since when taking care of a few issues there’s nothing wrong with it. Every flash based media and all HDD will die eventually but that’s what backups are made for (and smart filesystems making use of checksums and snapshots that ease transfer of this type of ‘backups’ a lot — we’re talking about btrfs here again but that’s a bit beyond the scope of this blog post)

            The reasons why SD cards seem to fail constantly (especially on Raspberries) are well known but nobody likes to talk about that but prefers workarounds instead for whatever reasons…

            1. When you say nobody – I like to talk about it – that’s why I started this thread.

              So to take your point – one should be looking for “good” SDs – which needs defining – is there anything wrong with commonly available (and not too expensive) Samsung 16G Evo and Sandisk 16G Ultra or would you avoid one or both… and your point about power supplies is well taken. I personally will be acting on that immediately. Good point also about AWG – I generally look for “thick” cable but if this narrows it down to getting thicker actual copper – so be it.

              1. Well, the relationship of SD card corruption with undervoltage wasn’t obvious in the beginning. We have a fleet of RPi B+ used as surveillance cameras that suffered a lot from ‘broken SD card issues’ which all disappeared after changing power setup (one central linear 24W PSU, passive PoE to the Pi, step-down converter and 5.2V/GND to GPIO pins 4 and 6).

                I personally use Samsung EVO, EVO+ and Pro (MLC instead of TLC flash) and SanDisk Ultra, Extreme Pro and Plus. If ‘endurance’ is really an issue I would go for Samsung Pro (or even Pro+), SanDisk Extreme Plus or those Transcend/Lexar with ‘endurance’ in their name.

                If budget is an issue I would choose Samsung EVO+ with 32 or 64 GB capacity just due to their good random IO performance and price/performance ratio (though no hard endurance numbers available). I’ve never bought SanDisk/Ultra myself (all came as part of dev samples from FriendlyELEC or Solid-Run) since they show a pretty low random write performance. Still worth a read: https://forum.armbian.com/index.php?/topic/954-sd-card-performance/

              2. Addendum: great read to understand the problem with ‘average’ USB cables (that were never meant to transport more than 500mA — the undervoltage mess we face on SBC happened for a reason!): http://goughlui.com/2014/10/01/usb-cable-resistance-why-your-phonetablet-might-be-charging-slow/

                Unfortunately other board makers followed and used Micro USB for DC-IN and everything as expected: forums full of complaints about instabilities and so on. New hotness: ASUS Tinkerboard with one of the most powerful SoCs around and using Micro USB. Guess what: forum starts to fill with reports of instabilities…

                Some vendors reacted one way or another. FriendlyELEC ships most of their boards with a superior Micro USB cable (I would assume 22AWG or even 20AWG rated) and RPi foundation masked the problem by implementing voltage control downclocking on demand.

                Unfortunately you need to call ‘vcgencmd get_throttled’ to get a clue whether you’re affected or not and have to interpret the hex number you get as individual bits (it’s bit 16 you’re interested in): https://forum.armbian.com/index.php?/topic/1665-rfc-using-a20-board-with-armbian-as-powermeter/&do=findComment&comment=13765

                A friend of a friend plays around with a 6 node RPi 3 cluster and complained about mediocre/unbalanced performance. Parsing vcgencmd output showed the problem and I bet all those RPi users suffering from frequent corruption issues have bit 16 set to 1 all the time.

                I wonder why Raspbian folks don’t implement such a check in a shutdown routine that writes a hint when this bit is set and warns the user at next login if this occurred 3 times. Instead the ‘low performance’ and ‘SD card corruption’ saga continues and people are not even aware that these problems are related to powering and especially the cable between board and PSU.

        2. SPI NOR flashes aren’t going to be a speed boon- and they’re not wear leveled by themselves.

          It’s not a plus if you’re doing periodic updates, etc…

            1. written in perl though. Which is a nogo for openwrt for example.

              It is so short that it might be worth to rewrite it in plain shell.

  27. sudo chmod +x log2ram*
    sudo mv log2ram.service /lib/systemd/system/
    sudo mv log2ram /usr/local/sbin/
    sudo mv log2ram.hourly /etc/cron.hourly/
    sudo systemctl daemon-reload
    sudo systemctl enable log2ram.service
    sudo systemctl start log2ram.service

  28. Thanks for raising this one Pete,

    like you, I split my time between two countries, with several RasPi SBCs in each house, so I don’t leave anything mission-critical running on a Pi. My best for lifetime so far has been an EmonPi from Openenergymonitor, with their mostly RO file system, which has run without a hitch for nearly two years (but I’m sure it will fail at some point). My worst has been Weewx weather stations at both houses, which seem to eat cards (although Weewx is definitely my preferred weather application for its sheer flexibility). They both write to card once every five mins, and one of them is currently running with some tweaks from the Weewx user group so I’ll see if that improves matters.

    As TK says above, I usually buy over-size cards as it just seems good practice. I’ve also considered trying a Pi with a good quality desktop SSD via a USB adaptor, but haven’t got round to it yet.

    Keep up the good work!

    1. Yes it’s ok when you’re sitting next to the project – another when it is thousands of miles away.

  29. [Meh, blog again ate a long comment and I forgot to copy it to clipboard before 🙁 ]

    So short version this time. Software/settings matter: use ‘noatime,nodiratime,commit=600’ in fstab, use stuff like log2ram as above, use btrfs with ‘compress-force=lzo’ added to fstab wherever possible (requires dual-core CPU or better and kernel 4.4 or above). And make use of zram instead of swap.

    1. So I started with

      tmpfs /var/log tmpfs defaults,noatime,nosuid,mode=0755,size=100m 0 0

      are you saying it should be more like this

      tmpfs /var/log tmpfs defaults,noatime,noatime,nodiratime,commit=600, nosuid,mode=0755,size=100m 0 0

      Log2Ram – the link had code but no description as to how to set up unless I missed something. Someone got a setup link? Yes, it does make sense to use RAM for logs assuming you have plenty of RAM then write them back at some interval. A comment was made questioning whether this OVERWRITES the log or UPDATES the log.

      Care to expand on “zram instead of swap” – as a not Linux expert that means little to me.

      1. The ‘noatime,noatime,nodiratime,commit=600’ options are only suitable for real filesystems living on flash media and not those tmpfs 🙂

        Re log2ram: for what goes where please see first commit: https://github.com/igorpecovnik/lib/commits/master/scripts/log2ram (only tested on Debian Jessie and Ubuntu Xenial and relying on systemd so that should work on Raspbian too). In case you still run Armbian on your NEO 2 you can have a look there at how it works.

        Re zram please simply refer to the wikipedia article. I’m still waiting for my NEO Plus 2 to arrive and will then start an extensive series of tests playing around with lzo vs. lz4 compression and performance behaviour in ‘low memory’ conditions. I’ll keep you updated but this will take some weeks.

    2. One of your entries may have just appeared again – it was marked as spam presumably due to links.

  30. The other great way to improve lifespan is software. The same amount of data when being written constantly in small 4K chunks or only every 10 minutes in a single attempt makes a huge difference regarding wear-out due to something called ‘write amplification’ and limited number of P/E cycles flash media shows. Details? http://electronics.stackexchange.com/questions/218914/how-long-until-my-emmc-is-dead
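
A deliberately crude model shows why the chunk size matters. It assumes every flush forces the card to program whole units of a fixed size, so a 4K write still costs a full unit. The 64 KiB unit used in the demo is a made-up figure, since the real behaviour depends on the card's controller and is usually not documented. Function names are invented.

```python
import math


def physical_bytes(chunk_bytes, flushes, unit_bytes):
    """Bytes the flash has to program if each flush is rounded up to whole units."""
    units_per_flush = math.ceil(chunk_bytes / unit_bytes)
    return units_per_flush * unit_bytes * flushes


def write_amplification(chunk_bytes, flushes, unit_bytes):
    """Physical bytes programmed divided by logical bytes requested."""
    logical = chunk_bytes * flushes
    return physical_bytes(chunk_bytes, flushes, unit_bytes) / logical


if __name__ == "__main__":
    KIB = 1024
    unit = 64 * KIB  # hypothetical program/erase unit
    # The same 600 KiB of data, written two different ways.
    print("150 flushes of 4 KiB :", write_amplification(4 * KIB, 150, unit))
    print("1 flush of 600 KiB   :", round(write_amplification(600 * KIB, 1, unit), 2))
```

In this toy model the many small flushes wear the flash sixteen times as fast as the logical data volume suggests, while the single batched flush stays close to 1. Real cards cache and merge writes, so actual figures will differ.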

    One easy attempt is to extend the ‘commit interval’ so the kernel buffers stuff to be written on disk in memory and flushes it to disk only every n minute, another is to avoid unnecessary writes at all (Armbian eg. chose 10m as default commit interval and to avoid timestamp updates due to this in fstab: noatime,nodiratime,commit=600). Please note that this has some drawbacks as usual.

    Then the log2ram service does the same to /var/log directory, mounts it transparently in RAM and syncs changes to disk only every hour and on shutdown while resyncing contents from disk back into memory on next startup. The downside is if your device crashes you’ll lose up to 9:59 minutes of stuff due to commit=600 and even almost an hour of log contents.

    Log files are an excellent candidate for compression. On devices with 2 or more CPU cores or where kernel 4.4 or above is usable I strongly recommend to use btrfs with at least ‘compress=lzo’ for this. A typical logfile containing 15MB of data will only use 1MB or below on disk. And lifespan increases also by 15 times if it’s only about writing those logs. So if the data you have to deal with is highly compressible btrfs is your friend. We have a small Olimex Lime2 at a customer that started as a temporary replacement for a central syslog server which stores now +40 TB log data on a 2 TB HDD 🙂
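
How well logs compress is easy to try out. The sketch below generates synthetic, repetitive log lines and compresses them with zlib from the Python standard library. zlib is only a stand-in here, since LZO is not in the standard library and btrfs compresses in separate chunks. The log format is invented, and real logs will give different ratios.

```python
import zlib


def make_log(lines=20000):
    """Build synthetic syslog-style text with small variations per line."""
    rows = []
    for i in range(lines):
        rows.append(
            f"Apr 12 10:{i // 60 % 60:02d}:{i % 60:02d} pi sensord[412]: "
            f"temperature={20 + i % 7}.{i % 10} humidity={40 + i % 11}\n"
        )
    return "".join(rows).encode("ascii")


def compression_ratio(data, level=6):
    """Uncompressed size divided by compressed size."""
    return len(data) / len(zlib.compress(data, level))


if __name__ == "__main__":
    log = make_log()
    ratio = compression_ratio(log)
    print(f"{len(log)} bytes, ratio {ratio:.1f}:1")
    # If only these logs are written, bytes hitting the flash shrink by the same factor.
```

Running it against a copy of a real /var/log file instead of make_log() gives a better idea of what transparent compression would save on a particular system.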

    1. Exactly, limiting the number of writes to a minimum is critical for SD card endurance and power loss reliability.

      It is more work than just apt-get’in packages, but for a truly robust setup, have a look at embedded buildsystems like Buildroot. We already have defconfigs for the raspberrypi, and read only rootfs is fully supported.

  31. Log2ram looks good, but it just means it regularly copies the log to disk. Which is good, but you’d want to append, i.e. only the latest log entries get written to RAM. The rest is already on disk. Then the save just appends to whatever is on disk.

    1. By defining USE_RSYNC=true (case sensitive) in /etc/default/log2ram append should happen — no idea why it defaults to false. Feedback welcome.

      BTW: as a result of this discussion, another improvement we use on production systems is now part of Armbian: https://github.com/igorpecovnik/lib/commit/28316c5a040000680f7574ba621788321a99c0bb

      (since if the filesystem already compresses logs it’s counterproductive when logrotate does the same — unfortunately the compress/delaycompress settings are set as part of Debian/Ubuntu packages so such an ugly hack is needed to deal with it when /var/log lives on a compressed btrfs fs)

  32. First: it always depends, you can get good or bad SD cards, same is true with USB sticks and even eMMC (the cheap consumer grade chips on most SBC can not be compared with industrial grade eMMC — the exception here are Olimex boards for a reason: they use expensive and slow eMMC on their boards that should show better endurance since their target audience are mostly industrial users and not makers)

    Then I would only choose SD cards from vendors that a) produce NAND flash, b) produce controllers (FTL, the flash translation layer living inside the card, which is responsible for how well the card handles things like wear-leveling and performance) and c) assemble both into retail products. So we already end up with just 4 vendors: Samsung, SanDisk, Transcend and Toshiba (the latter going out of business now).

    Then I order only through sales channels where returning products is no problem and always test flash media (be it SD cards, USB thumb drives or el cheapo SSDs) immediately after purchase since counterfeit cards are a real issue. Tools are available for free and everything is outlined in detail here: https://docs.armbian.com/User-Guide_Getting-Started/#how-to-prepare-a-sd-card

    Unfortunately there are no specifications regarding endurance ratings for this kind of product, so if I care about maximum endurance I start to search for ‘TBW’ (and then most likely end up with pretty expensive ‘industrial SD cards’). Why are those cards that expensive and how are they able to show better endurance? Simple: they have a much larger capacity in reality but expose only a fraction of it (when the card is internally 64GB but exposes only 16GB it will last 4 times longer than a card with ‘native 16GB’ using the same technology).

    And that’s the reason I buy SD cards with large capacity if I care about endurance. Even if the whole application and the data might fit into 8 GB I buy 32GB at least (other reason: the cheap 32 and 64GB EVO+ show also excellent performance unlike smaller or larger capacities).
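
The capacity argument above reduces to simple arithmetic. A minimal sketch in Python, assuming wear is spread evenly over all of the card's internal flash; the function names and the sample figures for P/E cycles, write amplification and daily writes are hypothetical and not vendor data.

```python
def endurance_factor(internal_gb, exposed_gb):
    """How much longer a card lasts than one whose flash equals its exposed size.

    Assumes perfect wear-leveling across all internal flash.
    """
    return internal_gb / exposed_gb


def estimated_lifetime_years(flash_gb, pe_cycles, write_amp, daily_writes_gb):
    """Very rough lifetime: total programmable bytes divided by bytes worn per day."""
    total_writable_gb = flash_gb * pe_cycles / write_amp
    return total_writable_gb / daily_writes_gb / 365


# The example from the comment: 64GB inside, 16GB exposed.
print(endurance_factor(64, 16))

# Hypothetical figures for illustration only:
# 3000 P/E cycles, write amplification of 10, 5 GB written per day.
native_16 = estimated_lifetime_years(16, 3000, 10, 5)
hidden_64 = estimated_lifetime_years(64, 3000, 10, 5)
print(f"native 16GB: ~{native_16:.1f} years, 64GB inside: ~{hidden_64:.1f} years")
```

The same reasoning is why buying a 32GB card for an 8GB workload helps: the data written per day stays the same while the flash available to absorb it grows.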
