stateless + nas   40

This DIY NAS In a Box Is Portable, Affordable, and Keeps Your Data Safe
We've established that the DIY approach is your favorite NAS option, so perhaps this tiny, super-portable NAS in a box would be a good weekend project.
Projects  DIY  Customization  Hacks  repurpose  Disks  Repurposing  Top  Nas  Hard_Drives  Storage  from google
december 2012 by stateless
RD5200
ReadyDATA OS 1.0 using ZFS
hardware  nas  zfs  netgear  readydata 
november 2012 by stateless
Synology DS1511+ vs. QNap TS-859 Pro, iSCSI MPIO Performance
I have been very happy with my QNap TS-859 Pro (Amazon), but I’ve run out of space while archiving my media collection, and I needed to expand the storage capacity. You can read about my experience with the TS-859 Pro here, and my experience archiving my media collection here. My primary objective with this project is storage capacity expansion, and my secondary objective is improved performance.

My choices for storage capacity expansion included:

- Replace the 8 x 2TB drives with 8 x 3TB drives, to give me 6TB of extra storage. The volume expansion would be very time consuming, but my network setup could remain unchanged during the expansion.
- Get a second TS-859 Pro with 8 x 3TB drives, to give me 18TB of extra storage. I would need to add the new device to my network, and somehow rebalance the storage allocation across the two devices, without changing the file sharing paths, probably by using directory mount points.
- Get a Synology DS1511+ (Amazon) and a DX510 (Amazon) expansion unit with 10 x 3TB drives to replace the QNap, to give me 12TB of extra storage, expandable to 15 x 3TB drives for 36TB of total storage. I would need to copy all data to the new device, then mount the new device in place of the old device.

I opted for the DS1511+ with one DX510 expansion unit; I can always add a second DX510 and expand the volume later if needed. As far as hard drives go, I’ve been very happy with the Hitachi Ultrastar A7K2000 2TB drives I use in my workstations and the QNap, so I stayed with the larger Hitachi Ultrastar 7K3000 3TB drives for the Synology expansion.

For improving performance I had a few ideas:

- The TS-859 Pro is a bit older than the DS1511+, and there are newer and more powerful QNap models available, like the TS-859 Pro+ (Amazon) with a faster processor, or the TS-659 Pro II (Amazon) with a faster processor and SATA3 support, so it is not totally fair to compare the TS-859 Pro performance against the newer DS1511+. But the newer QNap models do not support my capacity needs.
- I use Hyper-V clients and dynamic VHD files located on an iSCSI volume mounted in the host server. I chose this setup because it allowed me great flexibility in creating logical volumes for the VMs without actually requiring the space to be allocated. In retrospect this may have been convenient, but it was not performing well in large file transfers between the iSCSI target and the file server Hyper-V client. For my new setup I was going to mount the iSCSI volume as a raw disk in the file server Hyper-V client. This still allows me to easily move the iSCSI volume between hosts, but the performance will be better than fixed size VHD files, and much better than dynamic VHD files. Here is a blog post describing some options for using iSCSI and Hyper-V.
- I used iSCSI thin provisioning, meaning that the logical target has a fixed size, but the physical storage only gets allocated as needed. This is very convenient, but turned out to be slower than instant allocation. The QNap iSCSI implementation is also a file-level iSCSI LUN, meaning that the iSCSI volume is backed by a file on an EXT4 volume. For my new setup I was going to use the Synology block-level iSCSI LUN, meaning that the iSCSI volume is directly mapped to a physical storage volume.
- I use a single LAN port to connect to the iSCSI target, meaning the IO throughput is limited by network bandwidth to 1Gb/s or 125MB/s.
For my new setup I wanted to use 802.3ad link aggregation or Multi Path IO (MPIO) to extend the network speed to a theoretical 2Gb/s or 250MB/s. My understanding of link aggregation turned out to be totally wrong, and I ended up using MPIO instead.

To create a 2Gb/s network link between the server and storage, I teamed two LAN ports on the Intel server adapter, created a bond of the two LAN ports on the Synology, and created two trunks for those connections on the switch. This gave me a theoretical 2Gb/s pipe between the server and the iSCSI target. But my testing showed no improvement in performance over a single 1Gb/s link. After some research I found that the logical link is 2Gb/s, but the physical network stream going from one MAC address to another MAC address is still limited by the physical transport speed, i.e. 1Gb/s. This means that link aggregation is very well suited to, e.g., connecting a server to a switch using a trunk and allowing multiple clients access to the server over the switch, each at full speed, but it has no performance benefit when there is a single source and destination, as is the case with iSCSI. Since link aggregation did not improve the iSCSI performance, I used MPIO instead.

I set up a test environment where I could compare the performance of different network and device configurations using readily available hardware and test tools. Although my testing produced reasonably accurate relative results, due to the differences in environments it can’t really be used for absolute performance comparisons.

Disk performance test tools: CrystalDiskMark 3.0.1b, ATTO Disk Benchmark 2.46.

Server setup: Windows Server 2008 R2 Enterprise SP1. DELL OptiPlex 990, 16GB RAM, Intel Core i7 2600 3.4GHz, Samsung PM810 SSD. Intel Gigabit ET2 Quad Port Server Adapter. LAN-1 192.168.0.11, LAN-2 192.168.1.12.

Network setup: HP ProCurve V1810 switch, Jumbo Frames enabled, Flow Control enabled. Jumbo Frames enabled on all adapters. CAT6 cables. All network adapters connected to the switch.

QNap setup: QNap TS-859 Pro, firmware 3.4.3 Build0520. 8 x Hitachi Ultrastar A7K2000 2TB drives. RAID 6. 10TB EXT4 volume. 10TB iSCSI LUN on EXT4 volume. LAN-1 192.168.0.13, LAN-2 192.168.1.14.

Synology setup: Synology DS1511+, firmware 3.1-1748. 5 x Hitachi Ultrastar 7k3000 3TB drives. Synology Hybrid RAID (SHR) with 2 drive redundancy. 8TB iSCSI LUN on SHR2. LAN-1 192.168.0.15, LAN-2 192.168.1.16.

To test the performance using the disk test tools I mounted the iSCSI targets as drives in the server. I am not going to cover details on how to configure iSCSI; you can read the Synology and QNap iSCSI documentation, and more specifically the MPIO documentation for Windows, Synology and QNap. A few notes on setting up iSCSI:

- The QNap MPIO documentation shows that LAN-1 and LAN-2 are in a trunked configuration. As far as I could tell, the best practices documentation from Microsoft, DELL, Synology, and other SAN vendors says that trunking and MPIO should not be mixed, so I did not trunk the LAN ports on the QNap.
- I connected all LAN cables to the switch. I could have used direct connections to eliminate the impact of the switch, but this is not how I will install the setup, and the switch should be sufficiently capable of handling the load without adding any performance degradation.
- Before trying to enable MPIO on Windows Server, first connect one iSCSI target and map the device, then add the MPIO feature. If you do not have a mapped device, the MPIO iSCSI option will be greyed out.
The server’s iSCSI target configuration explicitly bound the source and destination devices based on the adapters’ IP addresses, i.e. server LAN-1 would bind to NAS LAN-1, etc. This ensured that traffic would only be routed to and from the specified adapters. I found that the best MPIO load balance policy was the Least Queue Depth option.

During my testing I encountered a few problems:

- The DX510 expansion unit would sometimes not power on when the DS1511+ is powered on, would sometimes fail to initialize the RAID volume, or would sometimes go offline while powered on. I RMA’d the device, and the replacement unit works fine.
- During testing of the DS1511+, the write performance would sometimes degrade by 50% and never recover. The only solution was to reboot the device. Upgrading to the latest 3.1-1748 DSM firmware solved this problem.
- During testing of the DS1511+, when one of the MPIO network links would go down, e.g. when I unplugged a cable, ghost iSCSI connections would remain open, and the iSCSI processes would consume 50% of the NAS CPU time. The only solution was to reboot the device. Upgrading to the latest 3.1-1748 DSM firmware solved this problem.
- I could not get MPIO to work with the DS1511+, yet no errors were reported. It turns out that LAN-1 and LAN-2 must be on different subnets for MPIO to work.
- Both the QNap and Synology exhibit weird LAN traffic behavior when both LAN-1 and LAN-2 are connected and the server generates traffic directed at LAN-1 only. The NAS resource monitor would show high traffic volumes on LAN-1 and LAN-2, even with no traffic directed at LAN-2. I am uncertain why this happens, maybe a reporting issue, maybe a switching issue, but to avoid it influencing the tests, I disconnected LAN-2 while not testing MPIO… [more]
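For anyone reproducing an MPIO setup like this, here is a minimal sketch using the Windows iSCSI and MPIO PowerShell cmdlets. Note that these cmdlets ship with Windows Server 2012 and later, whereas the 2008 R2 setup described above used the iSCSI Initiator GUI; the target IQN below is a hypothetical placeholder, and only the IP addresses come from the lab setup above.

# A minimal sketch, assuming Windows Server 2012+ PowerShell cmdlets;
# the target IQN is a placeholder, the IPs match the lab setup above.
Add-WindowsFeature Multipath-IO                     # install the MPIO feature
Enable-MSDSMAutomaticClaim -BusType iSCSI           # let the Microsoft DSM claim iSCSI disks
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy LQD # Least Queue Depth, as found best above

# One portal per path, each pair on its own subnet (LAN-1 and LAN-2)
New-IscsiTargetPortal -TargetPortalAddress 192.168.0.15 -InitiatorPortalAddress 192.168.0.11
New-IscsiTargetPortal -TargetPortalAddress 192.168.1.16 -InitiatorPortalAddress 192.168.1.12

# Connect the same target once per path, with multipath enabled
$iqn = "iqn.2000-01.com.synology:ds1511.target-1"   # hypothetical IQN
Connect-IscsiTarget -NodeAddress $iqn -IsMultipathEnabled $true -IsPersistent $true -TargetPortalAddress 192.168.0.15 -InitiatorPortalAddress 192.168.0.11
Connect-IscsiTarget -NodeAddress $iqn -IsMultipathEnabled $true -IsPersistent $true -TargetPortalAddress 192.168.1.16 -InitiatorPortalAddress 192.168.1.12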
performance  mpio  synology  nas  storage  review  iscsi  qnap  from google
july 2011 by stateless
Making sense of the various NAS hardware and software solutions
This past weekend I realized I had a sufficient need at home for some type of centralized storage solution. Ideally this solution would allow me to access my data from all of my machines via NFS, CIFS and iSCSI, and have some capabilities to stream music and videos across my wireless network. The number of NAS solutions I found astounded me, and I have been digging through reviews to see what is good.

During my research, I came across a slew of hardware and software solutions. The hardware solutions I added to my list came from various vendors, though I decided to scratch one large vendor (Drobo) after reading Curtis Preston’s blog post about his Drobo support experience. Here are the hardware vendors that made it into my possibility list:

- Buffalo Technology
- Intel
- Netgear
- Synology
- UnRAID

In addition to pre-built hardware, I also debated buying a low power system and running one of the following software NAS solutions on it:

- EON OpenSolaris-based NAS distribution
- FreeNAS FreeBSD-based NAS distribution
- NexentaStor Community edition
- OpenFiler Linux-based NAS distribution

Once I had a better feel for what was out there, I decided to pull out my notebook and write down the things that I wanted vs. needed in a NAS device. Here are the items I really wanted to have out of the box:

- Support RAID and drive auto expansion
- Support for NFS, CIFS and iSCSI
- Ability to run a DLNA/UPnP server to stream audio and video
- Easy to use and manage
- Low power consumption
- Extremely quiet
- Built-in hardware fault monitoring
- Well supported organization or community

The Synology devices seem to provide everything I’m after and then some, but the FreeNAS and OpenFiler projects provide a lot of flexibility that can’t be matched by the Synology (e.g., all the source is available). I’m currently leaning towards the Synology DS411J, but I may end up nixing that idea and building a small, quiet machine that runs OpenFiler/FreeNAS. If you have a centralized NAS device at home that meets the checklist above, please let me know in the comments.
nas  dyi  discussion  article 
january 2011 by stateless
First Looks: Thecus N7700PRO
 

Thecus Technology is one of those companies that focus purely on storage, from Direct Attached Storage and multimedia devices to SOHO and Enterprise Network Attached Storage devices.  Just announced today is the latest member of their Enterprise line, the N7700PRO NAS device.

 

With 7 bays of goodness, RAID 0, 1, 5, 6, 10 and JBOD capabilities, there can’t be much to not like.  Combining this array of storage flexibility is the horsepower to go with it:

With the N7700PRO, blistering performance is the name of the game. At its core is an Intel® Core 2 Duo CPU and a whopping 4GB of high-speed DDR2 800 memory, making it the most powerful NAS unit available. In fact, with its PCI-e slot, the N7700PRO can reach data transfer speeds of over 300MB/s by adding a PCI-e 1Gb Ethernet adapter! All of this raw power easily manipulates large amounts of data – perfect for the N7700PRO’s seven 3.5” SATA drive bays that can accommodate up to 14TB of storage. Need even more storage at your disposal? With its stackable feature, you can connect up to five N7700PROs together and easily manage them all via a master unit. The N7700PRO is even compatible with iSCSI initiators and supports iSCSI thin provisioning for added performance and flexibility.

No information on pricing, but since this NAS device is the new top dog in the Thecus product line, it won’t be cheap.

One sweet looking unit.  I have got to get me one of these (items to try out!)

More Info: Thecus N7700PRO
Hardware  nas  news  N7700PRO  thecus  we-got-served  wgs  from google
october 2009 by stateless
Installing a QNAP NAS as homeserver
Running servers at home has two major drawbacks: they generate heat and they consume power. I decided to look into a low-power PC like the ones based on an Intel Atom CPU. They can be found in pre-built systems that generally run at clock speeds between 1.2-1.8GHz, which is good enough for most home-server applications.
In total, such a system consumes around 35 Watts at full load, 25 idle. That is about 1/4th of a small server with power savings active, or up to 1/20th of a full-featured server. And they don’t generate all that heat, are much smaller, and make less noise.

Being sold as desktop systems, they come with graphics cards and all kinds of accessories that are not useful for servers.

Then I remembered my not-so-trusty Synology 207 NAS system, which is basically a small server. In the year I had it, it failed on me completely because of bad cabling to the drives, making me believe I had failing hard drives. In reality, the drives were OK but the SATA and/or power cables gave up on me. I found out it is a common problem with that model and that a new cable set does the trick. Mobile-Harddisk.nl kindly provided me a new set for free, but my trust in Synology was gone. Also, back then I decided to get the cheaper model with less RAM, which makes it impossible to run any additional software on it … and that is exactly what I want to do now.
I looked at some reviews and found that the QNAP systems provide a great storage system and run on considerably faster hardware. So … I got myself a QNAP TS-219 system …

It’s a server that runs on a 1.2GHz processor with 512MB RAM, and it is basically a Debian Linux system that provides all kinds of services, like Samba for file sharing. It has room for just 2 hard drives, which is enough for me. Apart from the very long list of features it comes with, it can be extended via the ipkg package system, comparable to APT or YUM. You can install any (text-based) software you like. I can have it download torrents, it can run my local in-house DNS (BIND), and it could act as a SOCKS4 proxy (nylon) for the MSN traffic they block at work.
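As a rough illustration of the ipkg workflow mentioned above (a sketch only; the package name bind is an assumption and may differ in the Optware feed):

ipkg update            # refresh the package lists
ipkg list | grep bind  # find the DNS server package
ipkg install bind      # installs under /opt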

The guys from QNAP set up the boot environment for the system from a downloadable image from their website. This image, or firmware as they call it, is a prebuilt setup of a working system that is kept read-only on the NAS. During the boot, the system loads this image into flash memory and the specific configuration files for the installed services are copied over from a persistent configuration storage to this flash storage. After the system services are loaded, it loads a chrooted environment and hides the actual system.
The system can be configured to run as you like, but you need to make sure your configuration changes are saved to the configuration storage as the running configuration in flash will be cleared upon a reboot. This makes the system robust, as accidental changes will never be fatal. It makes it also limited, because only the configuration files QNAP decides to copy over will be active after a reboot.

Or … are they?
One of the things you cannot change by default is the init.d setup used for starting additional services … and I wanted to start BIND. Luckily there’s a way around it, but it requires a little trick. The default environment runs one shell script at startup that can be edited. You can place all kinds of things in there, and it will run at boot. The only thing you need to remember is that because of the chrooted environment, things are never where you expect them to be. :)
To find the script, called autorun.sh, you need to mount the partition it’s on first. Log on via SSH and mount the partition:

mount /dev/mtdblock5 /tmp/config

Now it’s mounted in /tmp/config; for some reason everybody likes to mount it there, but I suppose you can mount it anywhere.
In /tmp/config, edit or create a file called autorun.sh and make it executable:

chmod +x autorun.sh

After you are done, unmount the partition before you reboot:

umount /tmp/config

The contents of the autorun.sh:

#!/bin/sh
(sleep 30;/opt/sbin/named -c /opt/etc/named/named.conf)&

You can reboot the system via the command:

reboot

You can also use the web interface for the reboot; it does exactly the same thing.
The most important part of the script is the 30-second wait: it takes about that long for the normal boot to finish and set up the chrooted environment that makes /opt point to the correct place. I found this out the hard way; it took me several hours to understand what was happening during the boot sequence.
The configuration files for named can be found in /opt/etc/named, not in the /etc directory, as that will be cleared upon reboot.
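Putting the pieces above together, the whole procedure looks roughly like this (a sketch assuming the same /dev/mtdblock5 partition and /opt paths used in this post; the block device can differ on other QNAP models):

# Consolidated sketch of the steps above
mkdir -p /tmp/config
mount /dev/mtdblock5 /tmp/config

cat > /tmp/config/autorun.sh << 'EOF'
#!/bin/sh
# Wait ~30s for the normal boot and chroot setup, so /opt points to the right place,
# then start named from the ipkg-installed paths.
(sleep 30; /opt/sbin/named -c /opt/etc/named/named.conf) &
EOF

chmod +x /tmp/config/autorun.sh
umount /tmp/config
reboot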

Go ahead and try it. I could replace my 24×7 home server with this device, as I also moved my home mail server Zarafa to a remote system, but more on that later. I even found packages for Asterisk available; maybe I'll even play around with that, although I wonder if the CPU is able to do the real-time codec translations.
Article  Technical  home  named  nas  qnap  server  from google
may 2009 by stateless
Home Fileserver: A Year in ZFS
Doesn’t quite have the same ring to it as ‘A Year in Provence’, does it? Oh well, never mind. After a year of using Solaris and ZFS for a home fileserver, I thought I would share my experiences here …
zfs  nas  home  dyi 
may 2009 by stateless
Introducing the RipNAS Statement
Less than 2 months ago, Terry reviewed a Windows Home Server from Illustrate called the RipNAS.  This appliance was designed around WHS and marketed toward the home media enthusiast.  Well, the guys over at Illustrate are at it again.  Introducing the

RipNAS Statement™


Make a Statement ‘Ripping NAS at home in the living room’

So what is so special about the Statement?  Other than this nifty looking appliance is a “CD Ripper, Windows Home Server, Network Attached Storage & Media Streamer in one silent box”, the

RipNAS Statement challenges the conception that NAS devices have to be hidden away, both for acoustic and aesthetic reasons, Statement is visually stunning, whilst acoustically designed. Statement combines Audio CD ripping, Network Attached Storage and Media Streaming in one Hi-Fi standard 43cm width box.

The Statement is available in two models: the super-silent Statement SSD (Solid State Drives), or the Statement HDD (more storage capacity).

Illustrate continues to blur the fine-line between data centers and media centers. Since the “Statement is pre-configured to work right out of the box, no computer, keyboard, LCD or computer skills are required. Say goodbye to audio CD clutter and move your media into the convenience of the digital age”, we are coming a lot closer to that Plug-N-Play do-all home server appliance that the average consumer is looking for.  And did I mention The Wife Factor?

Manufacturer: Illustrate
Model: RipNAS Statement

Price: From £ TBD/$ TBD
Web: RipNAS

Digital_Media_Receiver  Windows_Home_Server  Windows_Home_Server_Hardware  nas  video  illustrate  ripnas  silent  Solid_State_Drives  SSD  statement  we-got-served  wgs  whs  Hardware  akihabara  clevery  japan  SP9C  from google
may 2009 by stateless
Storage Alignment Document
NetApp has recently released TR-3747, Best Practices for File System Alignment in Virtual Environments. This document addresses the situations in which file system alignment is necessary in environments running VMware ESX/ESXi, Microsoft Hyper-V, and Citrix XenServer. The authors are Abhinav Joshi (he delivered the Hyper-V deep dive at Insight last year), Eric Forgette (wrote the Rapid Cloning Utility, I believe), and Peter Learmonth (a well-recognized name from the Toasters mailing list), so you know there’s quite a bit of knowledge and experience baked into this document.

There are a couple of nice tidbits of information in here. For example, I liked the information on using fdisk to set the alignment of a guest VMDK from the ESX Service Console; that’s a pretty handy trick! I also thought the tables which described the different levels at which misalignment could occur were quite useful. (To be honest, though, it took me a couple of times reading through that section to understand what information the authors were trying to deliver.)
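For reference, the general shape of the fdisk trick looks something like the sketch below. This is only a sketch (the device name /dev/sdX and starting sector 64 are illustrative assumptions, not values from TR-3747), and the partition must be created aligned before data is written, or the data migrated afterwards:

# Check alignment: a first partition starting at sector 63 (the old DOS default) is misaligned
fdisk -lu /dev/sdX
# Create an aligned partition interactively
fdisk -u /dev/sdX
#   n   create a new primary partition
#   x   enter expert mode
#   b   move the beginning of data to an aligned sector (e.g. 64)
#   w   write the partition table and exit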

Anyway, if you’re looking for more information on storage alignment, the different levels at which it may occur, and the methods used to fix it at each of these levels, this is an excellent resource that I strongly recommend reading. Does anyone have any pointers to similar documents from other storage vendors?

This article was originally posted on blog.scottlowe.org. Visit the site for more information on virtualization, servers, storage, and other enterprise technologies.

Storage  Virtualization  CLI  ESX  HyperV  NAS  NetApp  NFS  ONTAP  VMFS  VMware  from google
march 2009 by stateless
A Little Marvell Plugs Sub-Netbook Gap
As I've been telling anyone who would listen, one of the key recent trends has been the "race to the bottom" in terms of pricing for computer systems. The only real winner here (aside from the end-user) is open source - proprietary systems cannot cut prices enough, and are rarely flexible enough to allow the kind of experimentation that is necessary at this end of the market. Here's another great example of the kind of thing I have in mind:

Can a computer get any smaller and cheaper than a netbook? Marvell Technology Group Ltd. thinks so. The Silicon Valley chip maker is trying to create a new category of inexpensive, energy-efficient devices it calls "plug computers," for which it would supply the integrated processors. Strongly resembling those vacation timers that turn on your lights at night to ward off potential robbers, a plug computer is more of a home networking gadget that transforms external hard drives or USB thumb drives into full network-attached storage (NAS) devices.

Aside from the form factor, the other thing of note is the expected price for these GNU/Linux-based systems:

Marvell has already announced a handful of other resellers that plan to build plug computers. But it hopes to attract far more, so that it can eventually price its SheevaPlug chips low enough for vendors to profitably sell plug computers for as little as $49, Mukhopadhyay said.

At first sight, it's not clear why anyone would want one of these extremely small computers; but at prices around $50 you can bet all kinds of unexpected uses will start popping up. It's not hard to imagine a day when a house or office is full of dozens of tiny, low-cost and low-energy GNU/Linux-based devices, all talking to each other and other systems across the Net. Judging by the speed at which netbooks have caught on, it's probably closer than we think.
nas  marvell  plug_computers  netbooks  from google
february 2009 by stateless
2031: Enhancements to NetApp Cloning Technology
This session provided information on enhancements to NetApp’s cloning functionality. These enhancements are due to be released along with Data ONTAP 7.3.1, which is expected out in December. Of course, that date may shift, but that’s the expected release timeframe.

The key focus of the session was new functionality that allows for Data ONTAP to clone individual files without a backing Snapshot. This new functionality is an extension of NetApp’s deduplication functionality, and is enabled by changes within Data ONTAP that enable block sharing, i.e., the ability for a single block to appear in multiple files or in multiple places in the same file. The number of times these blocks appear is tracked using a reference count. The actual reference count is always 1 less than the number of times the block appears. A block which is not shared has no reference count; a block that is shared in two locations has a reference count of 1. The maximum reference count is 255, so that means a single block is allowed to be shared up to 256 times within a single FlexVol. Unfortunately, there’s no way to view the reference count currently, as it’s stored in the WAFL metadata.

As with other cloning technologies, the only space that is required is for incremental changes from the base. (There is a small overhead for metadata as well.) This functionality is going to be incorporated into the FlexClone license and will likely be referred to as “file-level FlexClone”. I suppose that cloning volumes will be referred to as “volume FlexClone” or similar.

This functionality will be command-line driven, but only from advanced mode (must do a “priv set adv” in order to access the commands). The commands are described below.

To clone a file or a LUN (the command is the same in both cases):

clone start <src_path> <dst_path> -n -l

To check the status of a cloning process or stop a cloning process, respectively:

clone status
clone stop

Existing commands for Snapshot-backed clones (”lun clone” or “vol clone”) will remain unchanged.
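Put together, a session on the filer would look roughly like this (a sketch based on the commands above; the volume and file names are hypothetical):

filer> priv set adv
filer*> clone start /vol/vmstore/gold.vmdk /vol/vmstore/vm01.vmdk
filer*> clone status
filer*> priv set admin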

File-level cloning will integrate with Volume SnapMirror (VSM) without any problems; the destination will be an exact copy of the source, including clones. Not so for Qtree SnapMirror (QSM) and SnapVault, which will re-inflate the clones to full size. Users will need to run deduplication on the destination to try to regain the space. Dump/restores will work like QSM or SnapVault.

Now for the limitations, caveats and the gotchas:

Users can’t run single-file SnapRestore and a clone process at the same time.
Users can’t clone a file or a LUN that exists only in a Snapshot. The file or LUN must exist in the active file system.
ACLs and streams are not cloned.
The “clone” command does not work in a vFiler context.
Users can’t use synchronous SnapMirror with a volume that contains cloned files or LUNs.
Volume SnapRestore cannot run while cloning is in progress.
SnapDrive does not currently support this method of cloning. It’s anticipated that SnapManager for Virtual Infrastructure (SMVI) will be the first to leverage this functionality.
File-level FlexClone will be available for NFS only at first. Although it’s possible to clone data regions within a LUN, support is needed at the host level that isn’t present today.
Because blocks can only be shared 256 times (within a file or across files), it’s possible that some blocks in a clone will be full copies. This is especially true if there are lots of clones. Unfortunately, there is no easy way to monitor or check this. “df -s” can show space savings due to cloning, but that isn’t very granular.
There can be a maximum of 16 outstanding clone operations per FlexVol.
There is a maximum of 16TB of shared data among all clones. Trying to clone more than that results in full copies.
The maximum volume size for being able to use cloning is the same as for deduplication.

Obviously, VMware environments—VDI in particular—are a key use case for this sort of technology. (By the way, in case no one has yet connected the dots, this is the technology that I discussed here). To leverage this functionality, NetApp will update a tool known as the Rapid Cloning Utility (RCU; described in more detail in TR-3705) to take full advantage of file-level FlexCloning after Data ONTAP 7.3.1 is released. Note that the RCU is available today, but it only uses volume-level FlexClone.

Storage  Deduplication  Insight2008  NAS  NetApp  NFS  ONTAP  Snapshots  WAFL  from google
december 2008 by stateless
Year One with the Linux Based NAS Server
Just because a hurricane hit us doesn't mean I can't
write a blog post!

Last September we "stood up", for the very first time, our
CentOS
Linux based cluster to replace the aged and unsupported Tru64
TruCluster. It was not all that long ago in fact that I wrote the wrap-up article to that adventure, so I
guess this is a postscript.

First off, the fact that I have changed roles has influenced
several things around the file server: A new manager took over my team,
and when the server had a problem she suddenly found out she had a
Linux file server she was responsible for. It is documented seven ways
from Sunday on the Wiki : Dan is amazing about things like that. The
problem of course is that when everything is working, no one reads the
doc. When it fails they don't have time. Dan works with me on my new
team, but went back and fixed the file server for the old team a couple
of times, and here is the nut of what this article is about. What I am
about to say here is going to be true about any and every complicated
bit of technology that people rely on every day: it will not be limited
to just Linux.

You have to know how to use the technology.

The Linux NAS was never advertised as being as good as the TruCluster that preceded it, but when it failed it took people who understood TruCluster / Tru64 / ADVFS to fix it. Same thing with any technology stack I have ever worked with.

Technology is only as good as
the people and
process that support it. See ITIL for details.

This is a truth that I think about all the time in my new role
as a technologist. 10% of the work is designing the solution. The rest
of it is training, communicating, and then going back and retraining
some more (more than likely).

Along comes this hurricane named Ike, and it is huge: As big
as the state of Texas from side to side. Houston's power grid crumbled
before Ike. The Linux NAS server has a weak spot in the design: It will
not run without electrons. I know, I know: We should have had wind
power as a backup. Next time....

Upon the return of power, the Global File System that
underlies the core design of the NAS marks many high I/O, high usage
file systems as needing repair and they will not mount. The log says
that the file system has been "withdrawn":

 --------------------- GFS Begin ------------------------

 WARNING: GFS filesystems withdraw
    GFS: fsid=rnd-fs:p4_gfs.1: withdrawn:

 WARNING: GFS withdraw events
    [<ffffffff884c3c94>] :gfs:gfs_lm_withdraw+0xc4/0xd3:
    GFS: fsid=rnd-fs:p4_gfs.1: about to withdraw from the cluster:
    GFS: fsid=rnd-fs:p4_gfs.1: telling LM to withdraw:

 WARNING: GFS fatal events
    GFS: fsid=rnd-fs:p4_gfs.1: fatal: filesystem consistency error:

 ---------------------- GFS End -------------------------

This is system admin 101 stuff: FSCK and fix stuff, and you are back running... except that with the cluster and GFS, the command's name is not FSCK. And you cannot just FSCK; here then is what Dan wrote on our Wiki about how to recover from this:

HOWTO: Recover a GFS filesystem from a "withdraw" state

When a corrupt GFS filesystem structure is discovered by a node, that node will "withdraw" from the filesystem. That is, all I/O for the corrupted filesystem will be blocked on that node to prevent further filesystem corruption. Note, other nodes may still have access to the filesystem as they have not discovered the corruption.

- halt/reboot - Use a hardware halt on the node that is in the "withdraw" state and then reboot that node. Note: a simple reboot command should work, but on our version of the cluster it seems to hang in the GFS umount stage on the withdrawn filesystem, so a hard reboot of the node seems to be required at this time.
- umount ${MOUNT_POINT} - Un-mount the filesystem on ALL NODES!
- gfs_fsck ${BLOCK_DEVICE} - Run a full fsck. Run on one node only!
- mount ${MOUNT_POINT} - On all nodes, to restore service.

Note: nfsd will hang on the withdrawn filesystem. You may need to relocate the NFS service to a surviving node first! A consolidated sketch of these steps follows below.
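The same recovery written out as a shell sketch, with example values for the mount point and block device (both assumptions; substitute your own):

MOUNT_POINT=/data/p4                # example mount point (assumption)
BLOCK_DEVICE=/dev/vg_nas/p4_gfs     # example clustered LV backing the GFS filesystem (assumption)

# 1. Hardware-halt the withdrawn node via its LOM/ILOM, then reboot it.
# 2. On ALL nodes:
umount $MOUNT_POINT
# 3. On ONE node only:
gfs_fsck $BLOCK_DEVICE
# 4. On all nodes, to restore service:
mount $MOUNT_POINT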

Since being in production, Dan has had to do this particular
recovery action about four times. Ike only gets credit for this last
one. The other three times had to do with a single node failing and
leaving I/O pending. This in turn appears to be the ILOM card in the
node acting up.

Next time: Some other handy Linux cluster things to know before
your Linux based cluster fails...


CentOS  CentOS_5.2  Dan_Goetzman  Data_Center_Linux  Enterprise_Linux_Server  GFS  Global_Warming  ITIL  Linux  Linux_HA  Linux_NAS  Linux_NAS_Server  Linux_in_the_datacenter  NAS  NAS_Server  NFS  Network_Attached_Storage  Tru64  Tru64_TruCluster  from google
september 2008 by stateless
CentOS 5 NAS Cluster
As I noted in my last post, here is an update on where we are at with replacing our trusty but aged Tru64 TruCluster NAS server with a new HA NAS server. The new server is a CentOS 5 based cluster with three nodes. I'll get into the particulars in a second, but first,

How We Got Here (in a nutshell)
Digital created the best cluster software in the world,
VAXCluster. Digital ported this to Tru64. Digital was sold to Compaq.
Compaq continued Tru64 and TruCluster. We had a NAS appliance. We
bought another. It failed and failed and failed, for over a year. We
replaced that with the TruCluster. HP bought Compaq and killed the
AlphaChip and Tru64 TruCluster future development. Our TruCluster
aged, and we began to look at replacements. Two appliance vendors
came in, were tested, failed. Tru64
started to have issues with new NFS clients. We started our
in-House
HA NAS testing based off our years of Tier II NAS using Linux.
Pant Pant Pant. Whew. Twenty plus years of history in one paragraph!

What We Liked About Tru64 TruCluster
It may be true that we over-engineered the Tru64 NAS solution.
After being burned so badly by the appliance, and having so many
critical builds depend on the server, we were not prepared for
anything other than the most reliable NAS we could figure out how to
build. Tru64 was tried and true. TruCluster was the best cluster
software there was for UNIX, and the Alphachip was the hottest chip
on the block back then. It all seemed to be a no-brainer.

Once built, we had rolling upgrades, and while a node might fail,
the service stayed up. Customer facing (my customer being of course
BMC R&D) outages were few and far between, and while we had data
loss issues once (leading to the Linux
snapshot servers), never a server failure. TruCluster let us
sleep at night.

We hoped that Linux clustering would one day catch up to
TruCluster, and so watched
things like the Linux SSI project with great interest.

Whatever we ultimately use, it has to pass the NAS
Server testing suite of tests.

Re-Thinking the NAS Solution
We knew what we liked about TruCluster, but after seven years, we
also decided it was time to question some of its very basic design
assumptions. We came up with a new set, tested the two new appliances
against them, and then decided to try to build
it ourselves out of Linux parts we found laying about the OSS
World.

On the assumption that a picture is worth a large quantity of words, here is a DIA diagram, saved as a PNG, that I drew of the new beastie:

http://lh3.google.com/stevecarl/RzTqSfdxlLI/AAAAAAAAADI/S3txOOy_KrQ/s144/lcfs-ha.png

Words Anyway
And now that we have that picture, a fair quantity of words is
probably in order explaining what in the heck that is all about.

The Servers are Sun X2200 M2's running CentOS 5 and Cluster Suite.
An X2200 is small, but it is big enough to keep the gig pipe full, so
we do not need anything bigger.

To make all the cluster stuff happen, we are using Cluster LVM on top of the Linux Multipath drivers. Each device has two paths because there are two switches in the SAN fabric, and each cluster node is hooked to each switch. GFS is layered on top of that to create the global file system across all the nodes.
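The stack described above maps roughly to the following commands (a sketch only; the device, volume group, logical volume and cluster names are assumptions, and the lock table name must match the cluster name in cluster.conf):

multipath -ll                             # verify each LUN shows two paths
pvcreate /dev/mapper/mpath0
vgcreate -c y vg_nas /dev/mapper/mpath0   # -c y marks the volume group as clustered
lvcreate -n p4_gfs -L 2T vg_nas
gfs_mkfs -p lock_dlm -t nascluster:p4_gfs -j 3 /dev/vg_nas/p4_gfs   # one journal per node
mount -t gfs /dev/vg_nas/p4_gfs /data/p4  # mount on each node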

Node one runs NFS. Node two runs Samba. Node three runs the
backups. Should the NFS or Samba node fail, the service will restart
on one of the surviving nodes, and since the file system is global to
all three nodes, no magic occurs at the service level to move the
file systems or anything.
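On the Cluster Suite side, checking state and moving a service between nodes looks roughly like this (a sketch; the service and node names are assumptions):

clustat                         # show cluster members and service state
clusvcadm -r nfs-svc -m node2   # relocate the NFS service to node2
clusvcadm -d nfs-svc            # disable (stop) a service
clusvcadm -e nfs-svc            # enable (start) it again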

The Spinning Bits
The disks are the ever-nifty Apple Xserve RAID units. We burn a fair amount of capacity for HA on these: the RAID 5 is 5+1, with a hot spare, for a total of seven disks per RAID controller. The disks are 750 GB SATA. There are 14 disks in each shelf, and we have two shelves, for a total of 15 terabytes of capacity, before formatting.

There is a single point of failure here: there is a single RAID card over each side, and even though there are two cards in the shelf, each card only manages half the disks. They do not talk to each other. This is not enterprise-grade storage.

We mitigate that risk by having bought the spares kits: We have
spare disks in carriers, spare RAID card, and spare RAID card
battery. This was part of the rethink: we decided to save some money
on the disks but have a recoverability plan. It is not that it will
never go down, but that we can get it going again quickly. The gear
is all on three year hardware support, so broken bits are a matter of
RMA'ing things, and everything should be designed to return to
service quickly.

We have over a year of runtime on these units on the second tier
storage, and have not had any serious issues thus far, thus our
willingness to try this configuration out.

Testing and Migration
In addition to all the run time on similar gear, we have been
beating the heck out of these. By “We” I of course mean “Dan”,
the master NAS blaster. Here is his Wiki record of the problems and
the workarounds from the testing:

NFSV2 "STALE File Handle" with GFS filesystems
Problem Description
Only when using NFSV2 over a GFS filesystem! NFSV3 over GFS is OK. NFSV2 over XFS is also OK. We were able to duplicate this from any NFSV2 client:

- cd /data/rnd-clunfs-v2t - To trigger the automount
- ls - Locate one of the test directories, a simple folder called "superman"
- cd superman - Step down into the folder
- ls - Attempt to look at the contents, which returns the error:

ls: cannot open directory .: Stale NFS file handle
Note: This might be the same problem as in Red Hat bugzilla #229346. Not sure, and it appears to be in a status of ON_Q, so it is not yet released as an update. If this is the same problem, it's clearly a problem in the GFS code.

Problem Resolution
To verify that this was indeed the same bug as Red Hat bugzilla #229346, I found the patch for the gfs kernel module and applied it to our CentOS cluster. The patch does indeed fix this problem! Instructions to apply the patch:

- Download the gfs kernel module source, gfs-kmod-0.1.16-5.2.6.18_8.1.8.el5.src.rpm (if your kernel is 2.6.18_8.1.1.el5)
- rpmbuild -bp gfs-kmod-0.1.16-5.2.6.18_8.1.8.el5.src.rpm - Unpack the source rpm to /usr/src/redhat/SOURCES
- cd /usr/src/redhat/SOURCES and add the following patch:

Filename: gfs-nfsv2.patch

--- gfs-kernel-0.1.16/src/gfs/ops_export.c_orig 2007-08-31 09:43:29.000000000 -0500
+++ gfs-kernel-0.1.16/src/gfs/ops_export.c 2007-08-31 09:43:52.000000000 -0500
@@ -61,9 +61,6 @@

atomic_inc(&get_v2sdp(sb)->sd_ops_export);

- if (fh_type != fh_len)
- return NULL;
-
memset(&parent, 0, sizeof(struct inode_cookie));

switch (fh_type) {

- cd /usr/src/redhat/SPECS and make the following changes:


Filename: gfs-kernel.spec

Name: %{kmod_name}-kmod
Version: 0.1.16
Release: 99.%(echo %{kverrel} | tr - _) <--Change the version from 5 to 99--<<<<
Summary: %{kmod_name} kernel modules

Source0: gfs-kernel-%{version}.tar.gz
Patch0: gfs-nfsv2.patch <--Add this line--<<<<
Patch1: gfs-kernel-extras.patch
Patch2: gfs-kernel-lm_interface.patch

%setup -q -c -T -a 0
%patch0 -p0 <--Add this line--<<<<
pushd %{kmod_name}-kernel-%{version}*
%patch1 -p1 -b .extras
%patch2 -p1

- rpmbuild -ba --target x86_64 gfs-kmod.spec - Build the new patched kmod-gfs rpm package
- rpm -Uvh /usr/src/redhat/RPMS/kmod-gfs-0.1.16-99.2.6.18_8.1.8.el5.x86_64.rpm - Install the patched gfs module
- depmod -a - Required step to see the new module on reboot
- Reboot the system to load the new kernel


NFSV2 Mount "Permission Denied" on Solaris clients
Problem Description
Certain Solaris clients, Solaris 7, 8, and maybe 9, fail with "Permission Denied" on mount when using NFSV2. Apparently the problem is a known issue in Solaris when the NFS server (in this case CentOS) offers NFS ACL support. Apparently, Solaris attempts to use NFS ACLs even with NFSV2, where they are NOT supported.

This problem has been fixed on more recent versions of Solaris
(like some 9 and all 10+).

Note: This problem was detected on a previous test/evaluation of Red Hat AS 5 and is expected with CentOS 5.
Disclaimer: I think this is an accurate description of the problem.
Problem Resolution
Assume Solaris NFS clients will NOT use NFSV2?

Cluster Recovery Fails on "Power Cord Yank Test"
Problem Description
The cluster software must fence a failed node successfully before it will recover a cluster service, like NFS or Samba. The fence method used in our configuration is the Sun X2200 Embedded LOM via remote ipmi. When the power cord on the X2200 servers is disconnected, the ELOM is also down. This causes the fence operation to the ELOM to fail. The cluster configuration allows multiple fence methods to be defined to address this issue. But there appears to be a bug in this version of the software that prevents the ccsd (Cluster Configuration Service Daemon) from answering the fenced "ccs_get" request for the alternate fence method when a node has failed.

Problem Resolution
None at this time. Waiting on a fix from CentOS. The assumption is that we can run with this configuration, but the cluster will not fail over services if a power cord or both power supplies on one of the X2200 nodes were to "fail". This would result in a service interruption.

And there you have it so far: We have our team's home directories running on the new servers, and other than being fast, we see no real difference yet. We are trading in a few problems on Tru64 for a few possible problems on CentOS 5, but we assume that we'll be able to either work around them (such as making Solaris clients use V3, which they tend to prefer anyway) or fix them with a patch to Cluster Services at some point to deal with the power cord issue.

Next time: “The Numbers of NAS” -or- “Speeds and Feeds for the … [more]
Apple  Apple_Xserve_RAID  CIFS  CentOS  CentOS_5  Linus_Server  Linux  Linux_NAS_server  NAS  NFS  Network_Attached_Storage  Samba  Tru64  Tru64_TruCluster  TruCluster  from google
november 2007 by stateless
x4100 + SunRay + VMware: Virtual Lab
I’ve put together a lab with an x4100, VMware ESX, a SunRay and an OSX laptop. This provides the infrastructure in my home office, and a super platform for experimenting with various software and architectural components.

Key components are:

Sun x4100 w/ 4 cores and 8Gb Memory
VMWare ESX 3.01
Solaris 10 x86 virtual machine (vm) running SunRay server
SunRay 1G appliance
Several Windows XP and Windows Server vm’s
cAos Linux vm running DHCP, bastion SSH and caching DNS
PowerBook G4 client
Synergy keyboard/mouse virtualization
Apple 23″ 1920×1200 Cinema Display
SyncMaster 1024×768 Display

Key features:

SunRay client to access Solaris 10 via X and Windows via RDP (uttsc)
X on the Mac for access to Linux and Solaris desktops with xnest and rdesktop
Synergy to share keyboard/mouse between SunRay and OSX

There are a lot of nits to go through, most of which I have workarounds for:

Sadly, there is some incompatibility between the SunRay and my Apple Cinema Display. (Update, this now works!)
Synergy software loses its connection between OSX and the SunRay session when the screensaver activates.
Cisco VPN can’t be launched from an RDP client.

The setup shows the technical feasibility for virtualizing Windows, Linux and Solaris desktops with VMware, using SunRay as a thin-client to access displays.

The typical method for virtualizing Windows instances uses Terminal Server or Citrix. This method deploys individual Windows virtual machines, typically Windows XP Professional, allowing users greater control over their “machine”.

I’m thinking about setting up the SunRay with a Windows session in kiosk mode for one of my daughters. If she can’t break it, I think it’s a good initial indication of usability.

The cloning capabilities of VMware make keeping “clean” installs of various base types a breeze, except for one small niggle: I’m running out of disk. I think my next project will be a white-box iSCSI or NAS server that VMware can use for additional storage.

IT_Architecture  cAos  Cisco_VPN  iSCSI  NAS  Solaris_10  Sun_x4100  SunRay  vmware  VMWare_ESX  from google
october 2006 by stateless

