Tuesday, August 19, 2014

Die Konqueror Die

What is the benefit of having a browser that doesn't work?  I would argue that it actually detracts from the experience of using the environment.

KDE does not gain or lose followers for having or not having a native browser.  Rendering web sites well in practice is the only criterion, and arguably the only browsers that meet it are Firefox and Chrome.


I'm all for competition, but now that Konqueror is looking for a maintainer is as good a time as any to shelve the product, write it off and re-allocate those resources to making Chrome and Firefox integrate better with KDE.  That would arguably gain KDE more users and make existing ones happier - it is a cause with more value.

P.S. That the web browser also functions as a document viewer and file manager is irrelevant.  Nobody uses it for those functions five minutes after they discovered the need for one of the better, dedicated programs, if ever they did.

P.P.S. I'm not even going to mention the other dysfunctional KDE web browser.

Tuesday, January 7, 2014

How to update your CNTLM password

CNTLM is awesome for enabling the use of Linux in a Microsoft-dominated workplace, in particular for getting onto the Internet when you need to authenticate with Microsoft Domain credentials.

CNTLM runs on your Linux system as a small proxy server. It receives requests for connections to web-based services and adds the necessary Microsoft authentication metadata to the outgoing packets before forwarding them on to the upstream "corporate" proxy servers.

For this CNTLM stores your credentials in a text file, usually /etc/cntlm.conf ... This file is checked on start-up.

When your Domain password changes, you need to "inform" CNTLM of the new password to use to get past the corporate proxy system. This is done by updating the cntlm.conf file and restarting the CNTLM service.

Because MS domain and authentication data often "takes a while to propagate to all services", I recommend that the following overall process be followed.

  1. Prevent any programs/devices from using old passwords (otherwise these may get you locked out of the network)
  2. Change your Domain password. Write it down (In a safe place)
  3. Go have a cup of coffee or do whatever you like, give the network "a while" (20 minutes) to propagate your new password.
  4. Log off and back on, access the Internet via MS Internet Explorer, access your web-based Exchange, etc... whatever you find convenient to make sure that your password is updated throughout the network.
  5. Follow the steps below to update your CNTLM password.
  6. Re-enable programs and devices with your updated password. This may include your Exchange account on your smart phone, proxy settings in your Linux package manager etc.

The CNTLM configuration file stores the following authentication details:

  1. MS domain Name
  2. MS domain user name
  3. MS domain authentication type (Usually NTLMv2)
  4. MS domain user password or a Hashed version of the password

It is recommended to use a hashed version of the password (instead of the actual password) in the configuration file. CNTLM includes a way of generating the updated hash so that you do not need to store the password in plain text. This adds a layer of security to the system, on top of the fact that the file is readable only by the superuser.

Start by preventing any applications from using the proxy - I use ProxyManager to disable the Proxy settings everywhere, ensuring that nothing will try to connect while the update is in progress.

johan@Komputer:~$ p-off
Disabling for KDE global
Disabling for S3cmd
Disabling for Dropbox
Disabling for VirtualBox
Disabling for Git global
Disabling for Wine IE
Disabling for Curl
Disabling for wget
Disabling for APT
Disabling for Root Bash
Disabling proxy for Root Curl

CNTLM must run when you update the password - start it if it is not.

johan@Komputer:~$ ps -ef|grep cntlm
cntlm     2102     1  0 Jan04 ?        00:00:05 /usr/sbin/cntlm -U cntlm -P /var/run/cntlm/cntlm.pid
johan    31452 30162  0 10:02 pts/2    00:00:00 grep --color=auto cntlm

You need to "be root" to update the CNTLM configuration.

johan@Komputer:~$ sudo -s
root@Komputer:~# cntlm -IM http://test.com
Password: 
Config profile  1/4... OK (HTTP code: 302)
----------------------------[ Profile  0 ]------
Auth            NTLMv2
PassNTLMv2      FEDCBA9876543210CC747CDB22103C1D
------------------------------------------------

What happens is that CNTLM prompts for the new password, and uses the Domain and User details from the config file to connect to the test URL provided. It tries all the known methods of authentication, and when a working method is found it displays the hash and the method of authenticating.

Use a text editor to update the configuration file using the displayed details. Save the file and then restart CNTLM to get it to read the updated hash from the configuration file.
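
The manual edit described above can also be scripted. The following is only a minimal sketch and not part of CNTLM itself: update_cntlm_hash is a hypothetical helper name, and it assumes the configuration file contains a single active (uncommented) PassNTLMv2 line like the one printed by cntlm -IM. Keep a backup copy of the file before trying it.

```shell
# Hypothetical helper: replace the active PassNTLMv2 line in a cntlm config file.
# Arguments: path to the config file, new hash as displayed by "cntlm -IM".
update_cntlm_hash() {
    conf="$1"
    hash="$2"
    tmp=$(mktemp) || return 1
    # Rewrite only the line whose first word is PassNTLMv2; commented-out
    # lines (starting with #) and all other settings are copied unchanged.
    if awk -v h="$hash" '
        $1 == "PassNTLMv2" { print "PassNTLMv2      " h; found = 1; next }
        { print }
        END { exit (found ? 0 : 1) }
    ' "$conf" > "$tmp"; then
        # Copy the contents back so the file keeps its owner and permissions.
        cat "$tmp" > "$conf"
        status=0
    else
        # No PassNTLMv2 line found: leave the original file untouched.
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

As root this would be called as "update_cntlm_hash /etc/cntlm.conf FEDCBA9876543210CC747CDB22103C1D", using the hash from the output shown earlier. If the authentication method changed as well, the Auth line still needs to be edited by hand.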

root@Komputer:~# /etc/init.d/cntlm restart
Stopping CNTLM Authentication Proxy: cntlm.
Starting CNTLM Authentication Proxy: cntlm.                                                                                                                    
root@Komputer:~# exit

Finally re-enable the Proxy in all applications.

johan@Komputer:~$ p-on
Enabling for APT
Enabling for Root Bash
Enabling for Root Curl (eg for Yast)
Enabling for KDE global
Enabling for wget
Enabling for Curl
Enabling for S3cmd
Enabling for VirtualBox
Enabling for Git global
Enabling for Wine IE
Enabling for Dropbox

Now if only there were a way to change the password for CNTLM, Contacts sync on my Android phone, Calendar Sync on my Tablet, MS Lync client on my phone, AND on the Microsoft domain, all at once.

Tuesday, December 17, 2013

The year of the Linux Desktop rode out on a Unicorn

The year of the Linux desktop is a myth.  It will never happen.  It has not happened.  And it isn't in progress either.

Oh, make no mistake.  Chrome OS is not even the second to latest entrée.  Right after Canonical's Edge Ubuntu Phone came the Free Forever Steam OS.

Of course Canonical is a little evil: They want to make money off of Linux.  We used to hate Red Hat for that, we turned our backs on them, and they went ahead and captured the internet server market regardless.  I can't help it, even yours truly is Red Hat certified.

Canonical arguably single-handedly put Linux on the Desktop market.  It is much, much easier to install software on Linux than it is to do so on Windows.  If you are not already used to the convoluted Windows way of doing things, then learning how to navigate your way around a Gnome or KDE desktop certainly has a gentler learning curve.  These days even printing and scanning work better on Linux, and PDFs just stop being impenetrable and immutable solid files.

To put a case in point: When the Windows "Fix Network Connection" function doesn't work, go ahead and try to figure out why.  When your (wife's) Windows system (again) doesn't want to print, go ahead and try to figure out why.  After all she didn't change anything and it worked yesterday.

Kubuntu is beautiful without the new-fangled desktop paradigm that really doesn't belong on a non-touch based system.  Linux is beautiful.  Ubuntu made Linux user friendly.

But we cannot give Canonical all the credit.  Google took a page out of Microsoft's book and gave us Android with Games.  Yes, when 1989's Windows gamers grew up and went to work in corporates they did not expect OS/2 on their office PCs, they wanted Windows.  Because Microsoft neglected to prosecute them in the nineties and let them play games when they were teenagers.

And Yes, Google is also a little bit evil (aka trying to make money out of us, trying to prevent us from going anywhere else with our data, spying on our search and email and buying habits)... but in the meantime they are building out on the Linux base, and I bow to them for that.  More users = more justification for big players (AMD, Intel, nVidia) to support Linux.  Not that there are any real alternative options, but still.

I must say I honestly do not care whether my Device Drivers are open source or not.  For all I care the driver can secretly use my GPU to generate BitCoins for AMD whenever my PC is idle.  All I care about is that my Device driver works well, supports all the hardware features, works on Linux, is supported and updated, and is included in the cost of the hardware.  As if I ever read the code to make sure there are no backdoors.

Linux is so prevalent these days that it is becoming nearly a household name.  I do blame Google a little for not making it more obvious that Android is based on Linux, but that is just PR - Linux has a stigma attached to it that it is not for the average Joe.  Which Android is.

Which reminds us of Java.  Can anybody remember who created Java?  Up until 3 years ago we all had Java.  It was something on any phone that could download and run apps.  It ran web based games and it did everything in between.  Those with very keen eyesight might have noticed the minutely small Sun Microsystems logo in the bottom right corner of the Java web page.  I call it a glaring, stupid failure to capitalize on an opportunity to market.  Sun Microsystems were in everybody's homes, but IBM were buying full page ads in Computer and gaming magazines.  Everybody knows IBM = Computers, but sadly Sun Microsystems, the original graphical workstation makers, are now little more than a memory for many, and essentially never were known outside of the core industry.

The Desktop came and went without Linux making it.

Thursday, December 13, 2012

Why hard drives are smaller than expected

People often ask why their Terabyte hard drive isn't a terabyte and time and again the simple, not necessarily false, answer given is that it is a marketing ploy by the evil manufacturers. But there is another answer.

In the good old days there were only the SI prefixes - A thousand meters in a kilometer, a thousand grams in a kilogram, and that is how we like it. Engineers and Scientists insist that it be so, well mostly they do. The SI standards organisation defines the prefixes this way.

The binary nature of digital computers lends itself to working with powers of two for units. The problem comes in with how close 1000 happens to be to the value 2^10 (1024) - the difference was considered negligible while designing computers and writing early computer system manuals. The habit stuck and the prefix "Kilo" in computer terms came to be used interchangeably for the value 1024. It would be wasteful to use 1000 as the demarcation for many computer allocation units because powers of two align well and make for more efficient and cost-effective designs.

This limitation is however specific to situations where bits are processed, stored or transferred in parallel. This includes processors, memory banks and system buses. Serial media, such as communication lines, networks, and hard drives do not suffer from this limitation. (It must be noted that while it is convenient to think of data stored on a hard drive as parallel bits, natively hard drives, just like tape devices, read and write bits in serial.)

A "kilo"-byte turns out to be a convenient measure for quantity of data. The difference also appears negligible at first glance, and using it this way feels comfortable to humans. Note however that the margin of error increases as we move to higher order numbers.

Prefix  Order  Binary prefix value        Decimal prefix value       Deviation
Kilo    1      1,024                      1,000                      2.40%
Mega    2      1,048,576                  1,000,000                  4.86%
Giga    3      1,073,741,824              1,000,000,000              7.37%
Tera    4      1,099,511,627,776          1,000,000,000,000          9.95%
Peta    5      1,125,899,906,842,624      1,000,000,000,000,000      12.59%

The net effect is that a binary Terabyte is nearly 10% larger than a decimal one, so a Terabyte hard drive shows up about 9% smaller than what you would expect its size to be!
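
The deviation figures are easy to check. Here is a small sketch using awk for the arithmetic; prefix_deviation and tb_to_gib are hypothetical helper names made up for this illustration. The first takes the prefix order (1 for Kilo, 2 for Mega, and so on), the second takes a drive size in decimal terabytes.

```shell
# Percentage by which the binary value 1024^n exceeds the decimal value 1000^n.
prefix_deviation() {
    awk -v n="$1" 'BEGIN { printf "%.2f%%\n", (1024^n / 1000^n - 1) * 100 }'
}

# Size of a drive sold as so many decimal terabytes, expressed in binary GiB.
tb_to_gib() {
    awk -v tb="$1" 'BEGIN { printf "%.2f\n", tb * 1000^4 / 1024^3 }'
}
```

Running "prefix_deviation 4" prints 9.95%, and "tb_to_gib 1" prints 931.32, which is the familiar "931 GB" that an operating system counting in powers of two reports for a 1 TB drive.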

As mentioned earlier, not all devices on a computer operate in parallel: networks are mostly serial lines. The phone and Digital lines that connect our homes to the Internet communicate in serial. The venerable computer mouse is a serial device. These days the USB protocol is used for just about anything and the "S" in USB in fact stands for "Serial".

Because hard drives in actual fact store data in serial (even "parallel" drives like ATA and SCSI drives eventually convert the data to a serial stream of ones and zeros), they follow the SI prefix specification for number of Bytes in a Gigabyte, while memory modules, which must maximize the investment in bus width and capacity, incorrectly follow a binary interpretation of the decimal prefixes!

The SI system only recognizes the powers-of-ten meaning of the prefixes. A new set of binary prefixes has been defined, though it is not part of the SI standard!

Kilo    1,000 = KB            1,024 = KiB (Kibi Byte)
Mega    1,000,000 = MB        1,048,576 = MiB (Mebi Byte)
Giga    1,000,000,000 = GB    1,073,741,824 = GiB (Gibi Byte)

Hard Drives, Modems, network cards, and aeroplanes are designed by engineers following the SI standard and their size specification conforms to the traditional SI meanings. Memory modules follow the size specifications of the Binary prefix system, but marketing brands these with SI decimal prefixes. We as consumers are therefore spoiled since we get more than what we pay for with RAM!

There are however two other items worth mentioning.

The first is Solid State storage devices, such as flash drives. Like RAM these are based on a natively parallel medium, and bits need to be counted and maximized for optimal capacity and effectiveness. Yet these are marketed the same way traditional hard drives are - with the SI meaning of GB or Gigabyte. You would think that (ignoring file system overheads) you should be able to store a GB of data from RAM onto a 1-GB solid state drive! Blame this one on marketing and exploitation of the people who have come to expect a "1 GB hard drive to be less than 1 GB".

The second is the size of files stored on a hard drive. These are commonly shown with KB having the binary system meaning instead of the SI meaning. This is despite the fact that the common storage media, hard drives and tapes, were natively formatted as serially accessible streams. I assume this may be in part because the writers of the early general purpose operating systems were so deeply ingrained in thinking about a Kilo-byte as 1024 bytes that they never considered doing it the other way, and possibly because those files had to be loaded into core memory which is allocated in chunks which have sizes that are powers-of-two.


So there you have it - don't blame the marketing guys for the missing space on your hard drive, thank them for the extra space on your memory modules. Blame the engineers though. :-)

Wednesday, November 14, 2012

Finding space for Solaris Live Upgrade

Something that is often perceived to be an obstacle to using Solaris Live Upgrade is finding space to give to Live upgrade. There are fortunately quite a few options to help out.

Oracle of course recommends that you use spare disks or buy more disks. That is all well and fine for big corporates with deep pockets... assuming that you have slots available to plug in more disks.

So … on to the more attainable options.

To start off it helps to know how much space you will need to perform the actual upgrade. Solaris itself needs about 7GB for a Full plus OEM installation, excluding logs, crash dumps, spool files, and so on. Use the df command to check how much space is used by the root and other critical file systems (/var, /opt, /usr). While this is a good starting point, you may not need to replicate all of that if it includes 3rd-party software that stays the same.

Since you are also able to combine and split file systems, as well as explicitly “exclude” portions of the boot environment through the -x or -f options, it is possible to get an estimate of the amount of space needed from Live Upgrade. To find out how much space Live Upgrade will need, run the lucreate command that you plan on using up to the point where it displays the estimated space requirement, and then press Ctrl-C to abort it.
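
For a rough first estimate before running lucreate, the used space reported by df can be totalled. This is a minimal sketch: sum_used_gb is a hypothetical helper name, and it assumes that df -k prints the used kilobytes in the third column with each file system on a single line (long device names can wrap on some systems). A file system that is listed more than once, for example when /usr is simply part of root, is counted only once.

```shell
# Hypothetical helper: total the "used" column (KB) of "df -k" output read
# from standard input and print the result in GB.
sum_used_gb() {
    awk 'NR > 1 && !seen[$1]++ { kb += $3 }   # skip header, count each device once
         END { printf "%.1f GB\n", kb / 1024 / 1024 }'
}
```

It would be used as "df -k / /var /usr /opt | sum_used_gb". The figure is an upper bound only; anything you exclude or leave on shared file systems reduces it.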

Option 1. By far the simplest scenario is if you are running on a ZFS root system already, then you are in luck: Live Upgrade has got good support for ZFS root, at least in recent versions of the tool, eg since Solaris 10 Update 6. It can take a snapshot of the root file system(s) and create a clone automatically, and then simply apply changes, like patches or an upgrade, to the clone.

ZFS makes it almost too easy, the command is simply:

# lucreate -c Sol10u5 -n Sol10u8

The command will automatically locate the active, or Primary, Boot Environment (PBE) and utilise free space from the pool (It checks the active BE to determine what ZFS pool is to be used). The -c option above causes the Active BE to be named explicitly (Sol10u5 in this example), and the -n assigns a name for the new Alternate BE, eg Sol10u8. (Let's just assume I'm going from update 5 to update 8)

It is a good idea to name your BEs based on the version of the operating system that they have, especially on ZFS; With ZFS it is (too) easy to have many BEs, eg for testing and backup purposes.

When using ZFS clones, the space required to create a new BE is less than with other file systems. This is because the contents of the “clone” points at the same blocks on disk as the original source data. Once a block is written to (from either the clone or the origin), the Copy-On-Write part of ZFS takes care of the space allocation. Data that doesn't change will not be duplicated!

You can therefore safely use the traditional methods for estimating your disk space requirements and rest assured that you will in practice need less than that.

Option 2: Another ZFS pool, other than the one that you boot from, may have free space, or you may want to move to another pool on separate disks for any other reason. When you explicitly specify a ZFS pool different from the source pool, Live Upgrade will copy the contents instead of cloning. Assuming a target ZFS pool name of “NEWPOOL”, the command would be:

# lucreate -c Sol10u5 -n Sol10u8 -p NEWPOOL

As before the active BE is probed to determine the source for live upgrade.

Note that I as a habit use upper-case names for my ZFS pool names. That is because I like them so much. It is also because it makes them stand out as pool names in command outputs, particularly df and mount!

Not really a separate option as such, but worth mentioning here: With ZFS boot being new, people often want to migrate from UFS to a ZFS root – The commands are the same as when migrating from one ZFS pool to another – once again the source is automatically based on the active BE and only the destination is specified.

You must be running (or going to upgrade to) at least Solaris 10 release 10/08 (S10 update 6) in order to utilize ZFS root. If running Solaris earlier than Update 2 then it will not be possible to use ZFS since the kernel must also support ZFS, not only the Live upgrade tools.

In the below example I create the new ZFS root pool using the drive c0t0d1:

# zpool create RPOOL c0t0d1s1
# lucreate -c Sol10u5 -n Sol10u8 -p RPOOL

The lucreate command will copy the active BE into the new ZFS pool. Note: You don't have to actually upgrade. Once the copy (create process) completes, run luactivate and reboot to switch over to ZFS.

# luactivate Sol10u8
# init 6

After checking the system, clean up …

# ludelete Sol10u5
# prtvtoc /dev/rdsk/c0t0d1s2 | fmthard -s - /dev/rdsk/c0t0d0s2
# zpool attach RPOOL c0t0d1s1 c0t0d0s1

I want to highlight that I specified partition (slice) numbers above. Generally the recommendation is to give ZFS access to the “whole disk”, but for booting it is a requirement to specify a slice.

A few extra considerations: The second disk is not automatically bootable, but rather than being redundant I will just link this excellent information

Now that you are on ZFS root you should also configure swap and dump “the ZFS way” - see here

If you choose for whatever reason not to move to ZFS root yet, maybe you are still not running Solaris 10 update 6 or later that supports ZFS booting, then you still have some options.

Option 3: Check whether you have free, unallocated space on any disks. The “prtvtoc” command will show areas on disk that are not part of any partition, as in the below example:

# prtvtoc /dev/dsk/c1t1d0s2

If any space is not allocated to a partition, there will be a section in the output before the partition table like this

* Unallocated space: 
*     First  Sector    Last 
*     Sector Count    Sector 
*     2097600 4293395776  526079 
*     8835840 4288229056  2097599 

If so, create a slice to overlay those specific cylinders (I do this carefully by hand on a case-by-case basis), and then use the newly created slice.

Note: A Solaris partition is called a disk-slice by the disk management tools. On X86, there is a separate concept called a partition, which is a BIOS partitioning. In this situation, all Solaris disk slices exist inside the Solaris tagged partition.

Option 4: If you do not have unallocated space on any disks, you might still have unused slices... Be careful though – unmounted is not the same as unused! Check with your DBAs whether they are using any raw partitions, ie partitions without a mounted file system. I've also seen cases where people unmount their backup file systems as a “security” measure, though the value in that is debatable.

It may be worth mentioning that when looking for space, you can use any disks in the system, it does not have to just be one of the first two disks, or even an internal disk.

To specify a slice to use as target you use the -m option of lucreate, eg

# lucreate -c Sol10u5 -n Sol10u8 -m /:/dev/dsk/c0t0d1s6:ufs

The above command will use /dev/dsk/c0t0d1s6 as the target for the root file system on the new BE.

You can also use SVM meta-devices. For example

# metainit d106 c0t0d1s6
# lucreate -c Sol10u5 -n Sol10u8 -m /:/dev/md/dsk/d106:ufs

Or on a mirror (assuming two free slices)

# metainit -f d11 1 1 c0t0d0s0
# metainit -f d12 1 1 c0t1d0s0
# metainit d10 -m d11 d12
# lucreate -c Sol10u5 -n Sol10u8 -m /:/dev/md/dsk/d10:ufs

Note that the traditional “metaroot” step is left to Live-upgrade to handle, and that the mirror volume in the example is created without syncing because the slices are both blank! You could always rather attach the second sub-mirror in the traditional way just to be safe.

Option 5: Split /var and the root file systems. If you have two slices somewhere but neither is large enough to hold the entire system, this could work. Then after completing the upgrade, you can delete the old BE to free up the original boot disk, and “migrate” back to that. It involves a bit of work, but you would use Live upgrade for this migration, which is exactly the kind of thing that makes Solaris so beautiful.

The commands to split out /var from root would look like this.

# lucreate -c Sol10u5 -n Sol10u8 -m /:/dev/md/dsk/d10:ufs -m /var:/dev/md/dsk/d11:ufs

When you compare this with the previous example you will notice there is an extra -m option for /var. Each mount point specified with -m will become a separate file system in the target BE. Adding an extra entry for /usr or any other file system works in same way. To better understand the -m options, think of them as instructions to Live upgrade about how to build the vfstab file for the new BE.
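
Since every separate file system is just one more -m option, the command line can be assembled mechanically. The sketch below only prints the resulting command so that it can be reviewed first; build_lucreate_cmd is a hypothetical helper name, and it uses nothing beyond the -c, -n and -m options shown in the examples above.

```shell
# Hypothetical helper: print (not run) a lucreate command line.
# Arguments: current BE name, new BE name, then one or more
# mountpoint:device:fstype specifications, each becoming a -m option.
build_lucreate_cmd() {
    cur="$1"
    new="$2"
    shift 2
    cmd="lucreate -c $cur -n $new"
    for spec in "$@"; do
        cmd="$cmd -m $spec"
    done
    printf '%s\n' "$cmd"
}
```

For example, "build_lucreate_cmd Sol10u5 Sol10u8 /:/dev/md/dsk/d10:ufs /var:/dev/md/dsk/d11:ufs" prints the same command as the /var example above.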

Note that non-critical file systems, eg anything other than root, /var, /usr and /opt are automatically kept separate and considered as shared between BEs.

Option 6: Temporarily deploy a swap partition or slice to use as a root file system. This would work if you have “spare” swap space. Don't scoff - I've many a time seen systems that have swap space configured purely for purposes of saving core dumps. The commands would be

# swap -d /dev/dsk/c0t0d0s0
# lucreate -c Sol10u5 -n Sol10u8 -m /:/dev/dsk/c0t0d0s0:ufs

There would be some clean-up work left once everything is done, for example deleting the old BE and creating a new swap partition from that space.

Option 7: A final option is to break an existing SVM mirror. In this case it will not be necessary to copy the file system over to the target, because due to the mirror, it is already there. The meta-device for the sub-mirror is also already there. We will however create a new single-sided SVM Mirrored volume from this sub-mirror for this process.

To do this you specify two things: A new “mirror” volume, as well as the sub-mirror to be detached and then attached to the new mirror volume.

Assuming we have d10 with sub-mirrors d11 and d12, we will create a new mirror volume called d100. We will remove d12 from d10, and attach it to d100. A single lucreate command takes care of all of that:

# lucreate -c Sol10u5 -n Sol10u8 -m /:/dev/md/dsk/d100:mirror,ufs -m /:/dev/md/dsk/d12:detach,attach,preserve

To examine the above command: You can see that -m is specified twice, both times for root. The first has the tags “mirror,ufs” and it creates the new mirror volume. The second has the tags “detach,attach,preserve”. Detach: Live upgrade needs to detach it first. Attach: Do not use it directly, instead attach it to the volume. Preserve: No need to reformat and copy the file system contents.

Instead of breaking a mirror and re-using the sub-mirrors, lucreate can set up the SVM meta-devices, for example:

# lucreate -c Sol10u5 -n Sol10u8 -m /:/dev/md/dsk/d100:mirror,ufs -m /:/dev/dsk/c1t0d0s3,d12:attach

Comparing to the previous example you will notice that the device specifier in the second field of the second -m option lists a physical disk slice as well as a name for a new meta-device. You will also notice that the only tag is “attach” because the new device doesn't need to be detached, and can't be “preserved” since it doesn't have any data.

Option 8: If you have your root file systems mirrored with Veritas Volume manager, and there is no other free space large enough to hold a root file system, then I suggest that you manually break the mirror to free up a disk, rather than try to use the vxlu* scripts.

I have not personally had access to a VXVM based system in years but from the rough time many people apparently have, based on the questions I see in forums, I would recommend that you un-encapsulate, perform the upgrade, and then finally re-encapsulate.

Option 9: If you have some rarely accessed data on disk you may have the option of temporarily moving that onto another system or even a tape in order to free up space. After completing the upgrade you can restore this data.

Option 10: Move 3rd party applications and remove crash dumps, stale log files, old copies of Recommended patch clusters, and the like, to other disks or file systems. This actually isn't a separate recommendation – it is something you should be doing in any case, in addition to any other options you use. This should be "Option # 0"

These files, if residing in the critical file systems, will be copied unless you expressly exclude them.

With ZFS root and snapshots it is less of an issue – the snapshot doesn't duplicate data until a change is written to a file. This however could create the reverse of the problem: An update to “shared” files that live in a critical file system will not be replicated back to the original BE because data in a cloned file system is treated as not shared!

You probably cannot exclude your applications or software directories, so instead do this: First move the application directory to any shared file system. Then create a soft-link from the location where the directory used to be, to where you moved it to. I have yet to encounter an application that will not allow itself to be relocated in this way, and can confirm that it works fine for SAP, Oracle applications, Siebel and SAS, as well as many other “infrastructure” software components, like Connect Direct, Control-M, TNG, Netbackup, etc.
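
The move-and-link step can be sketched as follows. relocate_dir is a hypothetical helper name, and the paths in the usage note are made-up examples. Stop the application before moving it.

```shell
# Hypothetical helper: move a directory to a shared file system and leave a
# soft-link behind at the old location.
# Arguments: directory to move, existing target directory on the shared file system.
relocate_dir() {
    src="$1"
    dest_dir="$2"
    name=$(basename "$src")
    # Refuse to overwrite something that already exists at the destination.
    if [ -e "$dest_dir/$name" ]; then
        return 1
    fi
    mv "$src" "$dest_dir/$name" || return 1
    ln -s "$dest_dir/$name" "$src"
}
```

For example, "relocate_dir /opt/myapp /export/apps" moves the directory to /export/apps/myapp and leaves /opt/myapp behind as a soft-link, so the application keeps finding its files under the old path.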

A few more notes:

  1. Swap devices are shared between boot environments if you do not specifically handle them. In most cases this default behaviour should be sufficient.
  2. If you have /var as a separate file system, it will be “merged” into the root file system, unless expressly specified with a separate -m option. This is true for all the critical file systems: root, /var, /usr and /opt
  3. On the other hand, all shareable, non-critical file systems are handled as shared by default. This means they will not be copied, merged, or have any changes done to them, and will be used as is and in place.
  4. To merge a non-critical file system into its parent, use the special “merged” device as the target in the -m option. For example, the following will merge /home into the root:
    # lucreate -c Sol10u5 -n Sol10u8 -m /:/dev/md/dsk/d100:ufs -m /home:merged:ufs
    

In this article I have not really spoken about the actual upgrade that happens after the new BE is created. I've posted about it in the past and in most cases it is already well documented on the web.


Another very interesting subject is what happens during luactivate! I'll leave that for another day! It is a real pity that Oracle is deprecating Live Upgrade, but it will still be around for a while.

Saturday, March 3, 2012

Sun said it first

It takes the x86-based market years to figure out the truths of what Sun Microsystems have been saying for years, in each case. The latest example is here, I quote from the second paragraph:
There's a category of server applications that can be better served by a lower class of good enough computing, delivering much better power efficiency. Content web servers, similar to what we use at AnandTech, don't present a hugely complex workload but they do see lots of threads and have largely variable performance requirements. SeaMicro's technology reduces power consumption by using lower power CPUs and highly power optimized motherboards.
Doesn't that sound just a little too similar to what Sun have been saying since forever? I wonder whether anybody else anywhere did as much innovation as Sun did? What a failure to market - almost every cellphone comes out with Java, yet almost nobody knows who created it. That, and Sun's habit of giving everything away for free, is why it exists no more. Good products alone don't make a company successful - the company needs to be able to turn those products into a profit.

Wednesday, March 30, 2011

Maintaining the Linux device driver code base

After a (sadly) failed attempt to convert my significant other to Linux, I had a discussion with her about why it failed. Root cause.

Her computer works well with Windows, not at all with Linux. The reason is that her laptop will display no better than 800x600 resolution as there is no good SIS671 graphics driver for Linux (and there is for Windows). Nothing recent, functional, supported, viable or workable.

Why isn't there one for Linux?

Because it doesn't make money. Business is the process of converting time into money. Sales people get customers to buy a product or service. Technical people produce the products or deliver the services. Management and administration functions support and enable the business to operate as a whole. (Or so the theory goes, but that is another story)

And because programmers also need a place to live. And to feed the kids.

There is cost involved – an investment, and there is a price, the return on the investment. A product, in this case a device driver for the graphics processor, needs to be designed, produced and supported. The technical people and the tools they need to do this do not come cheap.

Device drivers for Linux, however, do not make much, if any, money for the companies involved. People do not pay for device drivers, rather they (rightly) expect them to be included in the cost of the hardware.

Even closed source Linux drivers are free – the vendor has to cover the costs through the sales of hardware. But the business model is flawed – The cost to deliver the Linux device driver far exceeds the income generated from hardware sales to Linux users. Thus this expense must be subsidised from sales to Windows users.

Unless the Linux user base grows to reach a critical mass, the point where enough Linux users buy the hardware to be able to justify the cost of the driver development and support, the situation will not change.
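To make the critical-mass argument concrete, here is a minimal back-of-the-envelope sketch. Every figure in it is invented purely for illustration (I have no real cost or sales numbers from any vendor), and the function names are hypothetical. It only shows the shape of the calculation: how many units must be sold to Linux users before the driver pays for itself, and how much has to be subsidised until then.

```python
import math


def break_even_units(driver_cost: float, margin_per_unit: float) -> int:
    """Smallest number of units sold to Linux users whose margin covers the driver cost."""
    if margin_per_unit <= 0:
        raise ValueError("margin_per_unit must be positive")
    return math.ceil(driver_cost / margin_per_unit)


def driver_subsidy(driver_cost: float, margin_per_unit: float, linux_units_sold: int) -> float:
    """Shortfall that must be covered from other sales (0.0 once the driver pays for itself)."""
    return max(0.0, driver_cost - margin_per_unit * linux_units_sold)


# Hypothetical figures, chosen only to illustrate the arithmetic.
DRIVER_COST = 200_000.0   # assumed cost to develop and support the driver
MARGIN_PER_UNIT = 5.0     # assumed profit per unit of hardware sold
LINUX_UNITS_SOLD = 10_000 # assumed units bought by Linux users

print(break_even_units(DRIVER_COST, MARGIN_PER_UNIT))                  # 40000
print(driver_subsidy(DRIVER_COST, MARGIN_PER_UNIT, LINUX_UNITS_SOLD))  # 150000.0
```

With these made-up numbers the vendor would need four times as many Linux customers before the driver stops being a net cost.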

The above situation is the same, no, actually worse for other hardware: webcams, GPS units, cell phones, USB thumb drives, Bluetooth hardware, Wi-Fi and network cards, fingerprint readers and touch-pad input devices. Every single bit of hardware.

The Linux kernel includes almost all device drivers for the hardware because of this situation. It is the only way the Linux community can use most of the consumer hardware available in the world today – that is, by developing the needed device drivers themselves.

As a result Linux supports much, much more hardware than Windows does. Windows depends on the driver disks that ship with the hardware because Microsoft does not provide driver software for every bit of hardware out there!

The more you think about it, the more you realise just how unbalanced the situation really is! Microsoft sells its Windows operating system with only basic device drivers included – for proper functionality, features and performance, you need to load the hardware manufacturer's drivers. The hardware manufacturers provide the device drivers because otherwise they would lose the majority of their market – Windows users.

The Linux community, an entity that makes no money, needs to provide device drivers created through donated effort. I am aware of the exceptions, but that does not change the overall picture. The effort to maintain and update the base of device drivers included in the Linux kernel grows with the number of pieces of hardware to be supported. In other words: every time a new piece of hardware appears in the shops.

To add insult to injury, the Linux community locks itself in with the GPL, which means it cannot, for example, share effort with other Unix or BSD distributions, because the Linux kernel enforces the use of the restrictive GPL.

Even worse, a Linux device driver works only on a specific release of the kernel. This is because the kernel interfaces for device drivers change, and as a result the device driver needs to be re-compiled for every update, even minor updates, to the kernel. The amount of extra work this would place on hardware manufacturers to ensure that their device driver works on every kernel version is significant, and much more than what is needed for Windows or Solaris, for example.

The long and the short of it is that producing and maintaining device drivers for Linux is prohibitively expensive, and the market loss as a result of not supporting Linux users is essentially negligible to most hardware manufacturers' bottom line!

Regarding the market share situation: I have long held the belief that through “allowing” us to copy Windows, Bill Gates got the world using MS Windows. It is what most of us grew up with on our computers at home, and what we as a result expected when we entered the workplace. More than just the majority of the work force, today's computer gamer is tomorrow's IT business decision maker.

But there is some light on the horizon: The Wayland Display Server may just give the Linux graphics stack the performance boost it needs to make it a viable gaming platform, which in turn will gain it the adoption of many gamers, and in the long run more market share on the desktop. Now if only Linus would fix the device driver ABIs and APIs to make it that bit easier for hardware manufacturers to support their device driver software on Linux...

There is a lot of FUD on the net about how the "deliberately dynamic ABIs" of the Linux kernel make Linux drivers better maintained, less buggy, etc. Sigh.