ZFS on Linux resilver & scrub performance tuning


Improving Scrub and Resilver performance with ZFS on Linux.

I’ve been a longtime user of ZFS, since the internal Sun betas of Solaris Nevada (OpenSolaris).
However, for over a year I’ve been running a single box at home to provide file storage (ZFS) and VMs, and as I work with Linux day to day, I chose to do this on CentOS using the native ZFS on Linux port.

I had a disk die last week in a two-disk mirror (RAID-1).
Replacement was easy; however, resilvering was way too slow!

After hunting for some performance tuning ideas, I came across this excellent post for Solaris/illumos ZFS systems and wanted to translate it for Linux ZFS users: http://broken.net/uncategorized/zfs-performance-tuning-for-scrubs-and-resilvers/

The post covers the tunable parameter names and why we are changing them, so I won’t repeat/shamelessly steal it. What I will do is show that they can be set under Linux just like regular kernel module parameters:

[root@ZFS ~]# ls /etc/modprobe.d/
anaconda.conf blacklist.conf blacklist-kvm.conf dist-alsa.conf dist.conf dist-oss.conf openfwwf.conf zfs.conf

[root@ZFS ~]# cat /etc/modprobe.d/zfs.conf
options zfs zfs_arc_max=2147483648 zfs_top_maxinflight=64 zfs_resilver_min_time_ms=5000 zfs_resilver_delay=0

Here you can see I have raised the per-vdev scrub/resilver I/O limit (zfs_top_maxinflight) from the default of 32 to 64, raised the minimum resilver time per TXG (zfs_resilver_min_time_ms) from 3 seconds to 5, and set the resilver delay to zero. The parameters can be checked after a reboot:

cat /sys/module/zfs/parameters/zfs_resilver_min_time_ms
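Depending on your ZFS on Linux version, most of these tunables are also writable at runtime via sysfs, so you can experiment without a reboot or module reload. A minimal sketch using the same values as my zfs.conf above (paths as on my CentOS box):

echo 64 > /sys/module/zfs/parameters/zfs_top_maxinflight
echo 5000 > /sys/module/zfs/parameters/zfs_resilver_min_time_ms
echo 0 > /sys/module/zfs/parameters/zfs_resilver_delay

The modprobe.d entry is still needed to make the change persistent across reboots.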

Result: After a reboot, my resilver speed increased from ~400KB/s to around 6.5MB/s.

I didn’t tweak any further; it was good enough for me and I had other things to get on with.

One day I’ll revisit these to see what other performance I can get out of it. (I’m aware that on my box the RAM limitation is keeping ZFS from being ‘blazing fast’ anyway.)

Happy Pools!

[root@ZFS ~]# zpool status
  pool: F43Protected
 state: ONLINE
  scan: resilvered 134G in 2h21m with 0 errors on Tue Jun 24 01:07:12 2014

  pool: F75Volatile
 state: ONLINE
  scan: scrub repaired 0 in 5h41m with 0 errors on Tue Feb 4 03:23:39 2014

 

Matt

CF Push and case insensitive clients


So here’s a weird one that may save someone some time...

Trying to perform a cf push with a .jar file gets the following strange error!

matt@matjohn2-mac Downloads $ cf push test-tgs --path tgs-0.0.1-SNAPSHOT-jar-with-dependencies.jar
Instances> 1
1: 128M
2: 256M
3: 512M
4: 1G
Memory Limit> 1G
Creating test-tgs… OK
1: test-tgs
2: none
Subdomain> test-tgs
1: ft4.cpgpaas.net
2: none
Domain> ft4.DOMAINREMOVED
Creating route test-tgs.ft4.DOMAINREMOVED… OK
Binding test-tgs.ft4.DOMAINREMOVED to test-tgs… OK
Create services for application?> n
Save configuration?> y
Saving to manifest.yml… OK
Uploading test-tgs… FAILED
Upload failed. Try again with ‘cf push’.
Errno::EEXIST: File exists – /var/folders/3z/lfrkq96s43d3f02g2mybkf280000gn/T/.cf_c01c12c8-50bb-45d5-9c7b-f73577d6ae27_files/META-INF/license
cat ~/.cf/crash # for more details
matt@matjohn2-mac Downloads $ cat ~/.cf/crash
Time of crash:
  2013-11-05 22:43:16 +0000
Errno::EEXIST: File exists – /var/folders/3z/lfrkq96s43d3f02g2mybkf280000gn/T/.cf_c01c12c8-50bb-45d5-9c7b-f73577d6ae27_files/META-INF/license
rubyzip-0.9.9/lib/zip/zip_entry.rb:600:in `mkdir’
rubyzip-0.9.9/lib/zip/zip_entry.rb:600:in `create_directory’
rubyzip-0.9.9/lib/zip/zip_entry.rb:184:in `extract’
cfoundry-4.3.6/lib/cfoundry/zip.rb:28:in `block in unpack’
rubyzip-0.9.9/lib/zip/zip_entry_set.rb:35:in `each’
rubyzip-0.9.9/lib/zip/zip_entry_set.rb:35:in `each’
rubyzip-0.9.9/lib/zip/zip_central_directory.rb:109:in `each’
rubyzip-0.9.9/lib/zip/zip_file.rb:132:in `block in foreach’
rubyzip-0.9.9/lib/zip/zip_file.rb:90:in `open’
rubyzip-0.9.9/lib/zip/zip_file.rb:131:in `foreach’
cfoundry-4.3.6/lib/cfoundry/zip.rb:24:in `unpack’
cfoundry-4.3.6/lib/cfoundry/upload_helpers.rb:58:in `prepare_package’
cfoundry-4.3.6/lib/cfoundry/upload_helpers.rb:42:in `upload’
cf-5.2.0/lib/cf/cli/app/push.rb:84:in `block in upload_app’
interact-0.5.2/lib/interact/progress.rb:98:in `with_progress’
cf-5.2.0/lib/cf/cli/app/push.rb:83:in `upload_app’
cf-5.2.0/lib/cf/cli/app/push.rb:67:in `setup_new_app’
cf-5.2.0/lib/cf/cli/app/push.rb:48:in `push’
mothership-0.5.1/lib/mothership/base.rb:66:in `run’
mothership-0.5.1/lib/mothership/command.rb:72:in `block in invoke’
cf-5.2.0/lib/manifests/plugin.rb:137:in `call’
cf-5.2.0/lib/manifests/plugin.rb:137:in `block in create_and_save_manifest’
mothership-0.5.1/lib/mothership/callbacks.rb:74:in `with_filters’
cf-5.2.0/lib/manifests/plugin.rb:136:in `create_and_save_manifest’
cf-5.2.0/lib/manifests/plugin.rb:73:in `wrap_push’
cf-5.2.0/lib/manifests/plugin.rb:25:in `block in <class:ManifestsPlugin>’
mothership-0.5.1/lib/mothership/command.rb:82:in `instance_exec’
mothership-0.5.1/lib/mothership/command.rb:82:in `block (2 levels) in invoke’
mothership-0.5.1/lib/mothership/command.rb:86:in `instance_exec’
mothership-0.5.1/lib/mothership/command.rb:86:in `invoke’
mothership-0.5.1/lib/mothership/base.rb:55:in `execute’
cf-5.2.0/lib/cf/cli.rb:187:in `block (2 levels) in execute’
cf-5.2.0/lib/cf/cli.rb:198:in `save_token_if_it_changes’
cf-5.2.0/lib/cf/cli.rb:186:in `block in execute’
cf-5.2.0/lib/cf/cli.rb:122:in `wrap_errors’
cf-5.2.0/lib/cf/cli.rb:182:in `execute’
mothership-0.5.1/lib/mothership.rb:45:in `start’
cf-5.2.0/bin/cf:16:in `<top (required)>’
ruby-1.9.3-p448/bin/cf:23:in `load’
ruby-1.9.3-p448/bin/cf:23:in `<main>’
ruby-1.9.3-p448/bin/ruby_executable_hooks:15:in `eval’
ruby-1.9.3-p448/bin/ruby_executable_hooks:15:in `<main>’

Lots of head scratching followed.

It looks like (and is) a local error while unpacking the JAR; we haven’t even touched the PaaS/Cloud Foundry yet!

Also, unpacking the JAR manually complains as well. Hmm.

You’re running Mac OS X? … (Or maybe Windows?)

It looks like the issue is that the JAR has been built on a CASE-SENSITIVE OS (in our case, Jenkins on Linux)...

... and you’re trying to run cf push on a CASE-INSENSITIVE filesystem (in our case, a shiny Retina MacBook Pro).
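You can confirm the collision before blaming anything else. A quick sketch, assuming a standard Info-ZIP install (zipinfo) and the jar name from above: list the archive entries, lower-case them and look for duplicates. Here the culprit is almost certainly META-INF/license vs META-INF/LICENSE.

# entries that differ only by case will collide on a case-insensitive filesystem
zipinfo -1 tgs-0.0.1-SNAPSHOT-jar-with-dependencies.jar | tr '[:upper:]' '[:lower:]' | sort | uniq -d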

Workaround:

Run the same cf push from a Linux box and it works fine.

This link put me onto the issue (as the error is more than a bit confusing!);

https://groups.google.com/forum/#!topic/selenium-users/f8OMertwzOY

Hope this helps someone!

Openstack and PaaS; Love a good geek rift!


I’ve been in the Bay Area, CA for a couple of weeks now (excluding my cheeky jaunt to Vegas!) and even though I’m now on vacation, it’s been the perfect place to watch the OpenStack Havana drama unfold, mostly stemming from this catalyst;

http://www.mirantis.com/blog/openstack-havanas-stern-warning-open-source-or-die/

Well (for me, anyway) especially this bit;

Too many times we see our customers exploring OpenShift or Cloud Foundry for a while, and then electing instead to use a combination of Heat for orchestration, Trove for the database, LBaaS for elasticity, then glue it all together with scripts and Python code and have a native and supported solution for provisioning their apps.

‘Hell no!’ was my initial reaction, and while there has been a definite retraction from the tone of the whole post… I still think a hell no is where I stand on this.

And I’ll tell you why, but firstly;

  • I like OpenStack as an IaaS. I like its modularity for networking and the innovation taking place to provide a rock-solid IaaS layer.
  • It was a much-needed alternative to VMware for a lot of people, and its growth into stability is something I’ve enjoyed watching (competition is never a bad thing, right! ;) ).

That said, here’s why I’ll take my PaaS served right now, with a sprinkling of Cloud Foundry;

  • People tying things together themselves with chunks of internally written scripts/Python (I’d argue even Puppet/Chef, as we strive for more portability across public/private boundaries) is exactly the kind of production environment we want to move away from as an industry;
    • Non-portable.
    • Siloed to that particular company (or more likely, project/team).
    • Often badly maintained due to knowledge attrition.

.. and into the future;

  • Defined, separated layers with nothing connecting them but an externally facing API was, in my mind, the very POINT of IaaS/PaaS/XaaS and their clear boundaries.
  • These boundaries allow for portability.
    • Between private IaaS providers and the PaaS/SaaS stack.
    • Between public/private cloud-burst style scenarios.
    • For complex HA setups requiring active/active service across multiple, underlying provider technologies.
      • Think ‘defence in depth’ for IaaS.
      • This may sound far-fetched, but it is real and has already been used to back SLAs and protect against downtime without requiring different tooling in each location.
    • I just don’t see how a 1:1 mapping of PaaS and IaaS inside OpenStack is a good thing for people trying to consume the cloud in a flexible and ‘unlocked’ manner.

It could easily be argued that if we are only talking about private and not public IaaS consumption, I’d have fewer points to make above. Sure, but I guess it depends on whether you really believe the future will be thousands of per-enterprise, siloed, private IaaS/PaaS installations, each with their own specifics.

As an aside, another concern I have with OpenStack in general right now is the providers implementing it. Yes, there is an OpenStack API, but it’s amazing how many variations on it there are (maybe I’ll do the maths some day);

  • API versions
  • Custom additions (I’m looking at you, Rackspace!)
  • Full vs. Custom implementation of all/some OpenStack components.

Translate this to the future idea of PaaS and IaaS being offered within OpenStack, and I see conflicting requirements;

From an IaaS I’d want;

  • Easy to move between/consume IaaS providers.
  • Not all IaaS providers necessarily need the same API, but it would be nice if it was one per ‘type’ to make libraries/wrappers/Fog etc easier.

From a PaaS I’d want;

  • Ease of use for Developers
  • Abstracted service integration
    • IaaS / PaaS providers may not be my best option for certain data storage
    • I don’t want to be constrained to the development speed of a monolithic (P+I)aaS stack to test out new Key-Value-Store-X
  • Above all, PORTABILITY

This seems directly against the above for IaaS…

I.e. I don’t mind having to abstract my PaaS installation/management from multiple IaaS APIs so that I can support multiple clouds (especially if my PaaS management/installation system can handle that for me!); however, I DON’T want lots of potential differences in the presentation of my PaaS API causing issues for the ‘ease of use, just care about your code’ aspect for developers.

I’m not sure where this post stopped being a nice short piece and became more of a ramble, but I’ll take this as a good place to stop. PaaS vendors are not going anywhere IMHO, and marketing-through-bold-statements on the web is very much still alive ;)

Matt


Pivotal Cloud Foundry Conference and Multi-Site PaaS


So I recently got back from Pivotal’s first Cloud Foundry conference;

[Image: pivotalcfconf]

As I’m not a developer, I guess by the power of deduction I’ll settle for ‘cloud leader’.

While there, this newly appointed cloud leader, erm, led a pretty popular discussion on multi-site PaaS (with a focus on Cloud Foundry, but a lot of the ideas translate), with the intention of stirring up enough technical and business intrigue to move the conversation into something of substance and action.

I’m about to kick off further discussions on the vcap-dev mailing list (https://groups.google.com/a/cloudfoundry.org/forum/#!forum/vcap-dev), but a draft run here can’t hurt:

  • Having multiple separate PaaS instances is cheating (and a pain for everyone involved).
  • My definition of multi-site PaaS is currently: support application deployment and horizontal scalability in two or more datacentres concurrently, providing resilience and scalability beyond what is currently available.
  • The multi-site PaaS should have a single point of end user/customer interaction.
  • Key advantages should be simplified management, visibility and control for end customers.
  • Multi-site PaaS should be architected in such a way as to prevent a SPOF or inter-DC dependencies.

All well and good saying WHAT I want; how about the HOW?

  • A good starting point would be something that sits above the current Cloud Foundry components (i.e. something new that all API calls hit first) that could perform extensions to existing functions;
    • Extend API functionality for multi-site operations
      • cf scale --sites 3 <app>
      • cf apps --site <name>
      • etc.
    • Build up a map of PaaS-wide routes/DNS A records to aid in steering requests to the correct site’s GoRouters.
    • This may also be a good place to implement SSL, with support for per-application SSL certificates, as these will need to be available in any site the application is spanned to.
  • Further thoughts that I need to pin down;
    • I’d like to see this layer getting health state from each DC and redirecting new requests if necessary to best utilize capacity (taking into account latency to the client of the request).

Where this layer will sit physically, and how it will become this unification layer without itself becoming the SPOF, is still playing around my mind.
Current thoughts include;

  • Calling the layer GRouter (global router) for convenience.
  • This layer will be made up of multiple GRouter instances.
  • Passive listening to NATS messages WITHIN the GRouter instance’s own site for all DNS names/apps used at that site.
  • A distributed protocol for then sharing these names with all other GRouters (NOT A SHARED DB/SPOF).
    • No power to overwrite another GRouter’s DB.
    • Maybe something can be taken from the world of IP routing, where local routes outrank remote updates but the remote ones can still be used if need be.
    • Loss of DB/updates causes a GRouter instance to fail open and forward all requests to local GoRouters (this failure mode still allows independent operation of the local site).
    • DNS should only be steering real PaaS app requests or API calls to the GRouter layer, although they may arrive at the wrong DC depending on the use of anycast DNS *needs idea development*.
    • GRouters in normal mode can send redirects for requests for apps that are in other DCs, to allow a global anycast DNS infrastructure.
  • Multiple instances per DC, just like GoRouters, so they scale.

Other points which will need discussion are;

  • How does a global ‘cf apps’ work?
  • Droplets will need saving to multi-DC blob stores for re-use in scale/outage scenarios.
  • We will need a way to cope with applications that have to stay within a specific geographic region.
  • Blue/green zero-downtime deployments?
    • One site at a time?
    • Two app instances within each DC and re-mapping routes?
    • The second option would prevent the GRouter layer needing to be involved in orchestration, reducing complexity.

My next steps are to map out use cases and matching data flows embodying my ideas above.

Just FYI, all this was written on a SJL>LHR flight where awakeness was clinging to me like the plague, so expect revisions!

Matt

 

Google Nexus 4 Smartphone


I was going to write about the new Nexus 4 I’ve finally managed to get my hands on, and why, after many years of my mobile phones all having names beginning with ‘i’, I’m actually finding this new Android device hard to fault…

But this guy pretty much 100% summarizes my thoughts for me, right down to why previous attempts at running Android have failed for me… and therefore saves me the trouble! Worth a read, whichever side of the fence you are on!

http://gizmodo.com/5973073/an-iphone-lovers-confession-i-switched-to-the-nexus-4-completely

 

It used to just be paperwork that took time to work out!


So the delivery of a new corporate laptop this week got me thinking;

It’s much more powerful, portable and generally nicer than anything I own, and the fact of the matter is: if I’m near a computer for five out of the seven days every week, it’s going to be this one.

This ties into my last post about BYOD and staff ‘bringing their own data’. It works in reverse for corporate-supplied devices too;

BYOD is currently, let’s be honest, only really bringing your own ‘add-on’ devices, and not your full digital working requirement.

BYODv2, in my mind, will be where users fully bring every bit of local compute power they need to work, maybe minus the clutter/heaviness of standard peripherals such as screens and keyboards, and instead have a uniform, wireless dock interface.

So for as long as there are corporate laptops, which their users are more likely to have on them than any personal laptop for a massive percentage of their weekly lives, there will always be remnants of their own personal data from lunchtimes, weekends, corporate travel, after works etc.

I digress, back to MY new shiny corporate laptop;

  • I needed a kick to get all my data, which is currently spread everywhere, into some semblance of order, with peace of mind over backups etc.
  • I hate always being on the wrong device for what I’m looking for (ooh, I took that photo on my iPhone, not my laptop/tablet/Nexus 4/tomorrow’s device number seventy-six and a half).
  • I don’t like having to swap between personal/corporate equipment, especially now the corporate laptop is actually much more powerful than my own!

Turns out, this isn’t something I should have thought about: while the cloud solves some of these data-everywhere questions, it also opens up a lot more choice, i.e. a lot more questions on how to go about all this. It’s amazing just how many branches you end up with in your mind when you try to tackle what is, on the face of it, a simple ‘get to my data on my devices’ question.

Even without going into all the corporate access stuff, my personal digital life and (more to the point) access to it created the monstrosity below. Can you do any worse? Have you had the same thoughts and come up with a solution that works well? Do you want to talk to me about cool ideas for BYODv2 above? Comment or e-mail.

-Matt

[Mind map: how to get to my data on all my devices. Oh dear, it’s 3AM again.]

With ‘Bring your Own Device’ will come ‘Bring your Own Data’


A friend put me on to http://owncloud.org/, which despite its awful naming (oooh, cloud! That’s new and good, isn’t it? snore!) is an open source implementation of some of Apple’s iCloud feature set (from what I can see) which can be hosted anywhere you want.

At first I was about to hit download; as a techy, running my own things in my own VMs, or somewhere I control the data security, backups etc., makes me feel a bit better. But then I realized something which I think a lot of people will realize, techy or not;

I’m already using my own phone/tablet for both work and personal use. My corporate Dropbox/Exchange/data stores will be much better backed up (one would hope) than anything I’m going to run locally, and probably better than any very cheap consumer-level ‘cloud’ data store; and even if not, it doesn’t cost me anything.

… I’ll just back up/store my contacts/calendar/photos on there.

If you’re encouraging people to bring their own devices into your corporate infrastructure, don’t be surprised if they bring some personal data too. Expect ‘Personal’ or ‘Private’ or ‘NonWork’ directories to start appearing in users’ Dropboxes; expect groups of contacts named ‘personal’ so they can filter them on their phone when they are not working. Have you capacity-planned for this when thinking about BYOD?

For the record, I don’t think it’s such a bad thing, and I think taking a hardline stance against it will slow down BYOD adoption; the company is making savings on client endpoints, so the users must feel it is ‘worth’ tying their own devices into a possibly restrictive corporate policy.

Just my 10p
Keep the change!

Virtualization Principles with Paravirtualized IO


This isn’t really a proper post, more just some little notes I can point people to (which I guess, technically, is a blog post; ah well!).

So, virtualisation: I have used VMware products (most of them) and Xen previously in production, with VirtualBox as my desktop-virt product of choice for testing on my local machine for some years now. But times change, and my current view is this;

- Xen is disappearing from many distros (including the ones I mainly use in production) and being replaced with KVM.
- VMware vCenter/ESXi is a bit overkill for my test/home/local machine VM stuff.
- VirtualBox is good, but it annoys me that I need to install extra kernel modules and updates (even if it’s done through DKMS) when my kernel supports KVM anyway!

So I’ve moved a lot of home/test/local machine VMs to KVM.

KVM vs Xen
Not going into performance (many, many better testers have spent more time looking at this than I have), but just to clear up a couple of things. The reason distros have moved away from Xen Dom0/DomU support in favor of KVM is that “KVM is Linux, Xen isn’t”.

By this, we mean;
- KVM is a hypervisor made from components of your Linux kernel: it is the Linux kernel of your Linux install, placed onto the bare metal, that runs in the privileged area of your processor, providing hardware-assisted virtualization to guests on CPUs that support VT-x/AMD-V. You’ll notice that in this install you can still see the virtualization CPU extensions in ‘/proc/cpuinfo’, as it’s this OS that IS the hypervisor;

processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
stepping : 10
cpu MHz : 2666.583
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 2
cpu cores : 4
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dts tpr_shadow vnmi flexpriority
bogomips : 5333.16
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
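A quick sanity check on any KVM host (nothing assumed here beyond a stock kernel with the kvm modules loaded): the CPU flags and the module list show that the running Linux kernel itself is the hypervisor.

# non-zero output means the CPU advertises VT-x (vmx) or AMD-V (svm)
egrep -c '(vmx|svm)' /proc/cpuinfo
# the 'hypervisor' is just kernel modules in the running Linux kernel
lsmod | grep kvm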

- Xen, on the other hand, is nothing to do with Linux. It’s a separate hypervisor that runs its own non-Linux microkernel and then boots one instance of Linux as its first guest, called Dom0. This first guest has the privileges to control the underlying Xen hypervisor, but it’s still a guest: you won’t see your CPU flags even in your Dom0, because it’s not Linux running directly on the hardware, it’s Xen. You can also see this in grub; notice how your kernel and initrd are both passed to Xen as modules, as it’s Xen that grub is actually booting.

Hardware Virtualization – Peripherals (virtual)
So with hardware virtualization we no longer have to translate/emulate instructions (or VM memory pages, with the newer EPT stuff) to and from the processor; that’s handled in hardware, removing a massive overhead. However, even though the code is running ‘native’ on the processor, we still have to present hardware to our guest OSes.
This was previously done by emulating very common (read: old) hardware in software and presenting it to guests, so that guests would already have the drivers for that hardware.

Implementing old hardware does not necessarily limit the virtual hardware devices to the same performance limits as the original hardware (for example the motherboard models implemented in KVM/VMWare/XEN etc can support way more PCI/PCIe ‘devices’ than the original board had slots for).

KVM uses QEMU to present the device hardware to each virtual machine, and there is some performance degradation from having the following peripherals emulated in software;

- Network cards
- Storage controllers for HDD/SSD storage

VirtIO / Paravirtualized Hardware
To get around this, the idea of paravirtualized hardware was created. This changes the virtualization model somewhat, from;

The guest OS doesn’t know it’s being virtualized; it runs on what it thinks is standard hardware, using drivers the OS already has.

to;

We don’t really care that the guest OS knows it’s on a virtualization host, so why not improve throughput to the host’s hardware by giving the guests drivers that interact better with the host/hypervisor layer when passing I/O for disks, network cards etc.

This of course means the guest OS will need special drivers for whatever hypervisor we are using underneath, but it dispenses with the idea that it’s ‘real’ hardware: these paravirtualized guest drivers implement a more direct way of getting I/O to and from the hypervisor, without having to emulate a fake piece of hardware in software.

VMware has the VMXNET(1/2/3) network card, for which you need drivers from the ‘VMware Guest Tools’ installer. This is a paravirtualized NIC which has no basis in any real network card and gives better performance than the e1000e offered as another option in VMware.

Xen had the xenblk and xennet drivers, which did the same thing for NICs and storage controllers. VMware has paravirtualized storage controller drivers too; I just can’t remember their names :)

KVM (and now VirtualBox too) uses something called ‘VirtIO’.

What is VirtIO?
VirtIO is exactly the same principle as the above offerings from VMware and Xen, only it’s a movement to standardize paravirtualized peripheral drivers for guest operating systems across multiple virtualization platforms, instead of each hypervisor development team implementing their own guest-side drivers.

So for KVM, if you want better performance out of your network/disk I/O within a virtual machine, you’ll want to use ‘VirtIO’ devices instead of emulated hardware devices.
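As an illustration, here is a minimal virt-install sketch that requests VirtIO devices up front; the VM name, paths and ISO are made up, and the important parts are bus=virtio on the disk and model=virtio on the NIC (older virt-install versions use --ram rather than --memory):

virt-install \
  --name testvm \
  --ram 2048 --vcpus 2 \
  --disk path=/var/lib/libvirt/images/testvm.qcow2,size=10,bus=virtio \
  --network network=default,model=virtio \
  --cdrom /var/lib/libvirt/images/CentOS-6.5-x86_64-minimal.iso

For an existing guest, the same change can be made by switching the disk bus and interface model to virtio via ‘virsh edit <domain>’.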

More information can be found here; http://www.linux-kvm.org/page/Virtio

VirtIO also includes a driver to allow memory ballooning, much like the balloon driver within the VMware Guest Tools.

It is worth mentioning here that this is NOT the same as Intel VT-d or Single Root I/O Virtualization (SR-IOV). These are also related to how virtual machines/guest OSes interface with hardware, but in a very different way;

- VT-d technology allows a physical device (such as a NIC or graphics card in a physical PCIe slot on the host machine) to be passed directly to a guest; the guest uses the drivers for that particular device and speaks natively to and from the hardware. This requires VT-d extensions on the CPU to work and a hypervisor capable of utilizing the technology.

- SR-IOV allows multiple virtual machines to see a hardware device, i.e. share the physical hardware device, yet all still access the raw hardware natively just as one guest could with VT-d above. For example, 10 guests could share 1Gb/s each of a 10Gb/s physical network card, using the physical network card’s drivers directly in the guest (to support all of that card’s features and remove the need for the hypervisor to be involved in the I/O), without the need for emulated hardware or paravirtualized drivers. The hardware device (such as a NIC) needs to be designed to support SR-IOV, and so far only a handful of hardware components have been.
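A quick way to check whether an SR-IOV capable NIC has exposed its virtual functions (a sketch; interface names and paths will differ, and older drivers expose the VF count as a module parameter such as max_vfs rather than via sysfs):

# virtual functions show up as extra PCI devices
lspci | grep -i 'virtual function'
# on newer kernels, the supported VF count can be read from sysfs (example path)
cat /sys/class/net/eth0/device/sriov_totalvfs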

OpenSolaris / Solaris Express to Solaris 11 boot Issues


I have had a trusty Solaris box at home now for 5-6 years, running a few things;
- ZFS for my files, shared out through SMB for media and iSCSI for playing with netbooting and VMware shared storage.

- Xen (more recently) running on a Solaris Dom0, hosting a number of CentOS 5 DomUs for other Linux server based stuff.

- Multicast/Bonjour spoofing and Apple file sharing, making an excellent ‘fake’ Time Machine for backing up my MacBook Pro onto ZFS (works flawlessly and doesn’t have a single disk prone to failure, unlike the Time Capsules).

Over that time, I’ve either upgraded in place, or overwritten the OS and let the new version of Solaris import the ZFS pool, moving through;

Solaris 10, Solaris SNV_8X (Sun internal), Solaris SNV_9X (Sun internal), OpenSolaris (SNV_1XX), Oracle Solaris Express (SNV_151).

And everything was pretty much good :) Until now, that is; this time I tried to take the latest update, moving to the newly released Solaris 11.

Lots of things have changed in Solaris 11 compared to the SNV/OpenSolaris/Solaris Express years (well, I’m not saying there haven’t been a lot of changes during that time, just none that have negatively affected me, whereas these do);

- Support removed for Linux-branded Solaris zones.
- Support removed for Solaris 11 to be a Xen Dom0, or indeed to be the base of any form of virtualization solution apart from Solaris zones and VirtualBox (guessing this lets Oracle push its own virtualization products).
- No check in the ‘pkg update’ procedure as to whether the Xen kernel was in use before the upgrade.

So, cutting to the POINT OF THE POST: I updated, a new boot environment was created, the update was successful, I rebooted… and the boot fails!

You could just boot the previous boot environment, which works, but this is what you’ll need to do to boot the new BE;

1. Open the grub menu.lst at /rpool/boot/grub/menu.lst
2. Find the last entry in the file (named after the boot environment you’re having issues with)
3. Remove the references to Xen, as below;

Before;


title example-solaris-1
findroot (pool_rpool,0,a)
bootfs rpool/ROOT/example-solaris-1
kernel$ /boot/$ISADIR/xen.gz console=vga dom0_mem=2048M dom0_vcpus_pin=false watchdog=false
module$ /platform/i86xpv/kernel/$ISADIR/unix /platform/i86xpv/kernel/$ISADIR/unix -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive

After;

title example-solaris-1
findroot (pool_rpool,0,a)
bootfs rpool/ROOT/example-solaris-1
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive

We have just removed the Xen kernel and its options, and instead told grub to boot the ‘normal’ Solaris kernel. It seems ‘pkg update’ doesn’t check for this when upgrading.

Now reboot and try the boot environment from the grub menu; it should load fine, and after some information about upgrading the SMF versions, you’ll be ready to log in.

The second issue I found after this was that my SMB shares were not available; it seemed the SMB service was stopped due to dependencies. Starting the following services magically made my shares come back to life;


svcadm enable idmap
svcadm enable smb/client
svcadm enable smb/server

Verify with ‘share’;

matt@F43-PSRV1:~# share
IPC$ smb - Remote IPC
Matt /F43Datapool/Matt smb -
Public /F43Datapool/Public smb -
c$ /var/smb/cvol smb - Default Share

I hope this helps someone. The last thing I have to work out is whether VirtualBox will provide as stable a solution for my Linux VMs as Xen did (as it seems to be the only option I have now, apart from moving back to Linux and losing ZFS/SFM/Crossbow/COMSTAR etc., which I really don’t want to do).

That said, it really annoys me that Oracle has removed such a simple and powerful combination, Xen Dom0 plus ZFS, from the base Solaris image; it served a perfect need for people who don’t want a full, separate virtualization product, e.g. for testing, home use and small businesses. Why remove Dom0 support but keep DomU support? Anyone know?

O2 exposing mobile number of website visitors?


Here’s something that seems a little interesting: O2 appear to be sending a header containing the end user’s mobile number to any website visited over their mobile data network.

The header is ‘x-up-calling-line-id’.
Other networks don’t seem to feel the need to do this; I wonder what O2’s reasoning is. Either way, a questionable privacy fail here!

More info here;
http://lew.io/headers.php
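If you want to see exactly what your own carrier injects, a minimal sketch (assuming you have a box reachable from the phone, and a netcat build that accepts these flags) is to listen on a port and browse to it from the handset over mobile data; the raw request headers are printed to the terminal:

# on a server reachable from the phone (some netcat variants want 'nc -l -p 8080')
nc -l 8080
# then visit http://your-server:8080/ from the phone over 3G and read the headers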
