[ale] Bad Hard Drive

Jim Kinney jim.kinney at gmail.com
Thu Jun 16 22:07:01 EDT 2011


I think you're right on chipset and determinism.
I would almost bet that a hot-swap move followed by a reboot would rename the
drives in bus/port order. So if you pull sdc and reinsert it so that it comes
up as sdg, it will become sdc again on the next reboot.
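
A rough way to picture the naming behaviour discussed in this thread is the
toy model below. It is only a sketch of what the posters report, not of how
the kernel actually allocates names: boot_names() and next_hotplug_name() are
hypothetical helpers, port numbers stand in for bus/port order, and only the
single-letter names sda..sdz are handled.

```python
import string


def boot_names(occupied_ports):
    """Name drives sda, sdb, ... in ascending port order, as at boot time."""
    return {
        port: "sd" + string.ascii_lowercase[index]
        for index, port in enumerate(sorted(occupied_ports))
    }


def next_hotplug_name(names_in_use):
    """Return the name one letter past the highest name handed out so far.

    This mirrors the behaviour reported in the thread (a re-inserted drive
    gets the next letter after the last one assigned at boot); it is an
    assumption of this model, not documented kernel behaviour.
    """
    highest = max(names_in_use)
    return "sd" + chr(ord(highest[-1]) + 1)


# Six drives in ports 0-5 are named sda..sdf at boot.
names = boot_names(range(6))

# Pull the drive in port 1 (sdb) and put it back: it comes up as sdg.
names[1] = next_hotplug_name(names.values())
hotplugged = names[1]

# After a reboot the names are handed out in port order again.
after_reboot = boot_names(names.keys())

# A single drive in port 5 with every lower port empty is still sda.
lone_drive = boot_names([5])
```

In this model the name follows the position among the occupied ports, which
is why the same physical drive can be sdb, then sdg, then sdb again.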
On Jun 16, 2011 6:03 PM, "Michael Trausch" <mike at trausch.us> wrote:
> It is possible that nondeterminism depends on the SATA chipset, but I have a
> system in production where the drives are enumerated in the order they
> appear to the kernel. Furthermore... if I pull sdb out (yay hot swap) and
> put it back in, it'll get sdg (sdf is the last one assigned at boot time in
> this system).
>
> I don't really worry about it, though, because I use the serial numbers, not
> the node names.
>
> --
> Sent from my phone... a G2 running CM7 nightlies!
> On Jun 16, 2011 4:56 PM, "David Tomaschik" <david at systemoverlord.com> wrote:
>> On Thu, Jun 16, 2011 at 4:38 PM, Jim Kinney <jim.kinney at gmail.com> wrote:
>>> Sort of but not quite all correct. The hard drive chain was not changed
>>> totally. Non-deterministic for pci-bus devices but still traceable with
>>> lspci. Always non-d for usb (pita)
>>>
>>> Hard drives follow a specific pattern:
>>>
>>> BIOS spills data to sys about drive locations. Bus num followed by device
>>> num. That doesn't change 'cause it can't.
>>>
>>> If a new drive is inserted at a lower bus num than other drives, it gets
>>> called sda. Move the drive in socket 0 to socket 5 and it now is called
>>> sdf.
>>>
>>> But so what?! Most distros use UUID anyway so you can move your drives
>>> around between boots and it'll still work as long as the BIOS knows where
>>> the /boot drive is. Cool thing is error messages will reflect the current
>>> configuration.
>>>
>>> So if the bad drive is moved from sdb to sdf, on reboot the error will
>>> reflect the bad drive is sdf.
>>>
>>> So as long as drives stay plugged into the same sockets, detection will be
>>> deterministic, but the name is not fixed. Remember, with sockets 0-4 empty
>>> and a single drive in socket 5, that drive will be called sda.
>>>
>>> Thus looking for the next-to-lowest-numbered drive will reveal sdb, the
>>> failed drive in the OP. :-)
>>
>> Unless, of course, there are udev rules that specify otherwise.
>> Serial # is still the most reliable way to be CERTAIN of what you're
>> pulling. Removing the wrong drive from a RAID can make your day very
>> bad. (I've placed SN labels on the visible end of drives in my home
>> system for exactly this reason. Or paranoia. Or because I like
>> labels. Take your pick.)
>>
>>
>>
>> --
>> David Tomaschik, RHCE, LPIC-1
>> System Administrator/Open Source Advocate
>> OpenPGP: 0x5DEA789B
>> http://systemoverlord.com
>> david at systemoverlord.com
>>
>> _______________________________________________
>> Ale mailing list
>> Ale at ale.org
>> http://mail.ale.org/mailman/listinfo/ale
>> See JOBS, ANNOUNCE and SCHOOLS lists at
>> http://mail.ale.org/mailman/listinfo
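
On the serial-number point made in the quoted messages: on udev-based systems
the symlinks under /dev/disk/by-id usually embed the model and serial number,
so they can be used to find which node name a given drive currently has. The
sketch below is one possible way to list them; disks_by_id() is a hypothetical
helper, and the "-part" filter assumes the usual udev naming for partition
links.

```python
import os


def disks_by_id(by_id_dir="/dev/disk/by-id"):
    """Map each whole-disk by-id link name to the device node it points at."""
    mapping = {}
    for entry in sorted(os.listdir(by_id_dir)):
        # Partition links normally end in -part1, -part2, ...; skip them.
        if "-part" in entry:
            continue
        path = os.path.join(by_id_dir, entry)
        if os.path.islink(path):
            mapping[entry] = os.path.realpath(path)
    return mapping


if __name__ == "__main__":
    # Only attempt the listing where the udev directory actually exists.
    if os.path.isdir("/dev/disk/by-id"):
        for ident, node in disks_by_id().items():
            print(f"{ident} -> {node}")
```

Matching the serial printed on the drive label against this list is a way to
confirm which node name (and which physical drive) is about to be pulled.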