Thursday, December 29, 2005

5 Minutes with Monad

I've recently spent a bit of time trying to solve some problems using Microsoft's new Monad shell. (It's interesting that the creature on the cover of the O'Reilly book is the common toad.) It was a different experience, although the pain is eased a little because the default installation comes with the new commands (cmdlets) aliased to Unix names (like ls and ps).

The default security setting prevents scripts, even remotely signed ones, from being executed, and there seems to be no obvious way to turn execution on. The documentation for this is missing. To make it usable, set the execution policy in the registry:
set-property HKLM:\SOFTWARE\Microsoft\Msh\Microsoft.Management.Automation.msh -property ExecutionPolicy -value RemoteSigned

Going through the tutorials it did show itself to be kind of cool. For example, being able to select the top 10 processes based on VirtualMemorySize:
get-process | sort-object VirtualMemorySize | select-object -last 10
You can whack a "convert-HTML" or an "export-csv" on the end to produce the result in the format you want, or connect to Excel or SQL Server to retrieve data. A lot has been made of its native XML support and of how it passes strongly typed objects along the pipeline rather than the plain text streams of Unix shells.
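
To make the "pipeline of objects" idea concrete, here is a minimal sketch of the same sort, select, and export steps in Python rather than Monad. Everything in it is an assumption for illustration: the ProcessInfo record, its field names, and the sample values are hypothetical and are not part of Monad or of any real process API.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class ProcessInfo:
    """Hypothetical stand-in for the objects that get-process emits."""
    name: str
    virtual_memory_size: int


def top_by_virtual_memory(processes, count=10):
    """Sort ascending by virtual memory and keep the last `count` records,
    mirroring sort-object followed by select-object -last."""
    if count <= 0:
        return []
    return sorted(processes, key=lambda p: p.virtual_memory_size)[-count:]


def to_csv(processes):
    """Render the records as CSV text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "virtual_memory_size"])
    for p in processes:
        writer.writerow([p.name, p.virtual_memory_size])
    return out.getvalue()


# Made-up sample data, purely for illustration.
sample = [ProcessInfo("a", 30), ProcessInfo("b", 10), ProcessInfo("c", 20)]
print(to_csv(top_by_virtual_memory(sample, count=2)))
```

The point of the sketch is that each stage receives records with named, typed fields, so the sort key is a property lookup rather than a column carved out of text.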

One of the problems was trying to do line-by-line processing. There were promises of pipelining via XML streams, but according to "Replace lines in a text file?" (the first hit on Google) Monad doesn't support it. The lack of streaming appears to be a crucial omission in a toolset designed for system administrators, although it might not be fatal, as log files and the like don't usually come close to the available memory of modern systems.

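It does support accessing the .NET APIs, which provides a loophole. For example, to read a file line by line and replace "xxx" with "yyy":
$f = [System.IO.File]::OpenText("c:\file.txt")
# Compare against $null so that an empty line does not end the loop early
while (($line = $f.ReadLine()) -ne $null)
{
    # Emit the line with every match of "xxx" replaced by "yyy"
    $line -replace "xxx","yyy"
}
# Release the file handle once the last line has been read
$f.Close()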
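
For comparison, here is a minimal sketch of the same streaming idea in Python rather than Monad, writing the result to a second file so that only one line is held in memory at a time. The function names and the separate destination path are assumptions of mine, and note one difference: this does a literal, case-sensitive replacement, whereas -replace treats its pattern as a regular expression.

```python
def replace_in_stream(lines, old, new):
    """Yield each line with `old` replaced by `new`, one line at a time."""
    for line in lines:
        yield line.replace(old, new)


def replace_in_file(src_path, dst_path, old, new):
    """Stream src_path to dst_path without loading the whole file.

    dst_path must differ from src_path, because the source is still
    being read while the destination is written.
    """
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        # A file object iterates lazily, so this never holds more
        # than one line in memory.
        for line in replace_in_stream(src, old, new):
            dst.write(line)
```

Because the generator only ever sees the current line, an empty line in the middle of the input passes straight through instead of ending the loop.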

It was all for nothing, as I later found out that Monad doesn't support Windows 2000, which is what it needed to be deployed on; it is only supported on Windows XP, 2003 and Vista. Back to Windows Script Host (maybe using Ruby), I guess.

5 comments:

Paula said...

It makes sense that it doesn't do line-by-line text processing. After all, you commented on how so much has been said on its strongly typed objects.

To wrap the output of a process in an object, you really need to have all of that output (or else there will be a lot of info you won't have, or can't validate). That seems to be a contradictory requirement to stream processing (read: line-by-line processing). I notice that your fix drops the object paradigm for managing the data.

It probably bothered the developers at MS, but there are tradeoffs with these decisions.

Andrew said...

I don't see anything contradictory about strongly typed and streaming - you simply have to be able to define a meaningful chunk of an object.

With XML it's documents, elements, etc - a good example of a streaming, strongly typed parser is XMLBeans.

With plain text files it's lines. CSV files you could do on a terminator and on a line basis. SQL it could be tables, rows, tuples, etc.

Paula said...

In this case you're getting a list of objects. Sure, the list is an object, but it's not really strongly typed, as you don't know what kind of objects it contains. You also have to do many of your operations on each element in the list, rather than on the entire list object.

Adam Barr said...

You can indeed process a line at a time. For example in the post you linked to, Bruce Payette showed how to read a file and replace text:

$data = (get-content file.htm) -replace "href","HREF"; $data > file.htm

More generally, the get-content cmdlet puts each line of the file into the pipeline as a string, so you can do:

get-content foo.txt | where-object { $_ -like "ABC*" } | foreach-object { "Line starting with ABC" }

etc.

- adam

Andrew said...

The problem with the example given is mentioned in that same post too: "This example loads the entire script into memory at once so you probably don't want to use it process huge scripts."

Or from the Monad Documentation: "The typical way to do file I/O is with the Get-Content and Set-Content Cmdlets. These deal with the entire contents of a file in a single operation."