
Re: [OCLUG-Tech] help for new Linux users - before the club meeting

  • Subject: Re: [OCLUG-Tech] help for new Linux users - before the club meeting
  • From: Alex Pilon <alp [ at ] alexpilon [ dot ] ca>
  • Date: Mon, 6 May 2013 14:45:37 -0400
On Mon, May 06, 2013 at 10:22:51AM -0700, Rob Echlin wrote:
> I would like to have some suggestions and volunteers to present 5 minute 
> demos of "basic Linux topics", either before or after the main meeting.
> This would be in a small group around a table, not to the main group.

I don't mind helping out with a few topics here and there. However, I
use neither office software nor bloated substitutes for typical
widespread Mac or Windows software, so I may not be able to help with
that. Just use Pandoc markdown. Be forewarned that I'm unusually
minimalist at times and despise a lot of the popular software out there.
I'm not going to preach though.

> - setting up a web server

Here's an example for a very simple web server, sthttpd (Gentoo adoption
of thttpd). I wouldn't recommend the stereotypical Apache or Nginx setup
as an example because those servers are just big, often less secure,
and more complicated than most beginners would need for just basic
hosting, unless you're doing more than very basic CGI (like setting up
MediaWiki). If you're doing the latter though, you probably aren't a
beginner or are rushing ahead too quickly. Keep things conceptually
simple for an introductory example. No need for noise/details.

Strip the comments out, of course.

You need basic build tools installed (gcc, make, etc.).

	$ wget http://opensource.dyc.edu/pub/sthttpd/sthttpd-2.26.4.tar.gz # download
	$ tar -xvzf sthttpd-2.26.4.tar.gz # extract
	$ cd sthttpd-2.26.4 # go to directory where the archive contents were extracted
	$ ./configure # configure
	$ make # compile
	$ ./src/thttpd -p 8080 -d /home/foo # start server on port 8080, serving the directory of user foo; change -d to what you wish.

	Point browser to http://localhost:8080. See a directory listing.
	Play around to your fancy.

	$ killall thttpd # kill the web server, only once you are done

Of course, I can show this in person to anybody who needs a bit more
help.
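
For a demo it may be nicer to serve a scratch directory with a known
page than somebody's home directory. A minimal sketch, assuming a
made-up directory name (thttpd-demo) and that thttpd picks up index.html
as the index page, which I believe is its default:

```shell
# Hypothetical scratch directory for the demo; any directory you own will do.
docroot="${TMPDIR:-/tmp}/thttpd-demo"
mkdir -p "$docroot"

# A one-line page, so the browser shows something other than a file listing.
printf '<html><body><h1>Hello from thttpd</h1></body></html>\n' > "$docroot/index.html"

# Then, from the sthttpd build directory (not run here):
#   ./src/thttpd -p 8080 -d "$docroot"
#   curl http://localhost:8080/
echo "Serve $docroot with: ./src/thttpd -p 8080 -d $docroot"
```

This only prepares the directory; starting and killing the server works
as in the commands above.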

> - setting up a database

Should we show how to set up a database or do very very basic relational
theory first? Are we setting up a database that is to be used by some
web app without the user understanding the transactions or is this to
use directly?

> - simple programming demo

Assuming programming knowledge or not?

> - simple shell script demo

Are we trying to show the idioms of shell and the possibilities, or very
utilitarian simple tasks?

>   - anyone got scripts they use a lot?

Despite the fact that one shouldn't parse HTML with regex (see
http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags),
for quick and dirty hack jobs, the following works the overwhelming
majority of the time and is dead simple (not to mention UNIXy).

I'm too lazy to write a whole article on why one should{,n't} use regex
for parsing HTML so I'll just quote and reference
http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html:

> It's considered good form to demand that regular expressions be
> considered verboten, totally off limits for processing HTML, but I
> think that's just as wrongheaded as demanding every trivial HTML
> processing task be handled by a full-blown parsing engine. It's more
> important to understand the tools, and their strengths and weaknesses,
> than it is to knuckle under to knee-jerk dogmatism.

DON'T USE THIS FOR SERIOUS/PRODUCTION CODE!!! Use a proper library. I
use python-beautifulsoup4 for real work.

Does anybody know of a simple, serverless, relatively lightweight, free,
command-line XQuery program for Linux?

Substitute "wget -O-" for "curl -L" if you don't have curl installed.

	# Extract value of 'src' (typically pictures — <img> tag) attributes
	# from stdin or a file
	ExtractSrc() {
		grep -h -o -e 'src="[^"]*' -e "src='[^']*" "$@" | cut -b 6-
	}

	# Same but for URL
	WWWExtractSrc() {
		for i in "$@"; do
			curl -L "$i"
		done | ExtractSrc
	}

	# Href attribute
	ExtractHref() {
		grep -h -i -o -e 'href="[^"]*' -e "href='[^']*" "$@" | cut -b 7-
	}

	WWWExtractHref() {
		for i in "$@"; do
			curl -L "$i"
		done | ExtractHref
	}

Example usage:

	$ WWWExtractHref www.google.ca
	/search?
	http://www.google.ca/imghp?hl=en&tab=wi
	http://maps.google.ca/maps?hl=en&tab=wl
	https://play.google.com/?hl=en&tab=w8
	http://www.youtube.com/?gl=CA&tab=w1
	http://news.google.ca/nwshp?hl=en&tab=wn
	https://mail.google.com/mail/?tab=wm
	https://drive.google.com/?tab=wo
	http://www.google.ca/intl/en/options/
	http://www.google.ca/history/optout?hl=en
	/preferences?hl=en
	https://accounts.google.com/ServiceLogin?hl=en&continue=http://www.google.ca/
	/advanced_search?hl=en-CA&amp;authuser=0
	/language_tools?hl=en-CA&amp;authuser=0
	http://www.google.ca/setprefs?sig=0_gfV8pKaZ5Y8nvCacQ1-DzqgLHJg%3D&amp;hl=fr&amp;source=homepage
	/intl/en/ads/
	/services/
	https://plus.google.com/108349337900676782287
	/intl/en/about.html
	http://www.google.ca/setprefdomain?prefdom=US&amp;sig=0_MBIMbCZvX4HeLz5KnLTciIkZvz4%3D
	/intl/en/policies/

Doesn't handle meta http-equiv refresh.
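
In the same quick and dirty spirit, a meta refresh target could be
pulled out roughly like this. A sketch only: ExtractMetaRefresh is a
made-up name, and it assumes the whole <meta> tag sits on one line with
an unquoted URL inside the content attribute.

```shell
# Hypothetical helper: print the target URL of <meta http-equiv="refresh"> tags
# read from stdin or from the files given as arguments.
ExtractMetaRefresh() {
	# Keep only the refresh <meta> tags, then the url=... part, then drop "url=".
	grep -h -i -o '<meta[^>]*http-equiv=["'\'']*refresh[^>]*' "$@" \
		| grep -i -o "url=[^\"'>]*" \
		| cut -b 5-
}
```

A URL wrapped in its own quotes inside the content attribute comes out
empty, so this is no more robust than the functions above.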

I sometimes WWWExtractHref a listing or archive page, xargs -L 1 curl
the output, ExtractSrc the result, then grep for a few patterns of
images of interest.
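
That pipeline can be sketched as one function. ImagesOfInterest and the
FETCH variable are made-up names for illustration; FETCH defaults to
curl, and "wget -q -O-" should work in its place. It only follows
absolute links, since relative ones like /services/ would need the host
prepended first.

```shell
# Attribute extractors, repeated here so the block stands on its own.
ExtractHref() { grep -h -i -o -e 'href="[^"]*' -e "href='[^']*" "$@" | cut -b 7-; }
ExtractSrc()  { grep -h -o -e 'src="[^"]*' -e "src='[^']*" "$@" | cut -b 6-; }

# Hypothetical helper: $1 is the listing page URL, $2 a grep pattern
# for the images of interest.
ImagesOfInterest() {
	# Fetch the listing, fetch each linked page, list its src values, filter.
	${FETCH:-curl -L -s} "$1" \
		| ExtractHref \
		| xargs -L 1 ${FETCH:-curl -L -s} \
		| ExtractSrc \
		| grep -e "$2"
}
```

Usage would be something like ImagesOfInterest http://example.org/archive/ '\.jpg$',
where the URL is a placeholder.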

There's also this to search the telephone directory on the Exchange
server at work. Still a Windows shop, despite most of our products
running Linux (or Protel like the DMS 100).

	Telephone() {
		ldapsearch -x \
				   -W \
				   -h obfuscatedDomainControllerName.genband.com \
				   -b DC=genband,DC=com \
				   -s sub \
				   -o ldif-wrap=no \
				   -D FOO\\bar \
				   -S displayName \
				   "(CN=$1*)" \
				   displayName telephoneNumber \
		  | grep -e ^displayName -e ^telephoneNumber # get rid of junk in output
	}

Example usage:

	$ Telephone Jane\ Doe
	telephoneNumber: +1 (123) 456.7890
	displayName: Jane Doe

Will show multiple results if the name is a prefix of several full names
(for example "Dave" in "Dave…").
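
With several matches, one line per person is easier to read. A rough
sketch, assuming the raw ldapsearch output (before the grep) is piped
in, with entries separated by blank lines as LDIF does. LdifToTable is
a made-up name, and values that ldapsearch prints base64-encoded
(attribute followed by two colons) are simply skipped.

```shell
# Hypothetical helper: turn LDIF entries into "name<TAB>number" lines.
LdifToTable() {
	# RS="" makes awk read one blank-line-separated entry per record;
	# FS="\n" makes each attribute line a field.
	awk 'BEGIN { RS = ""; FS = "\n" }
	{
		name = ""; tel = ""
		for (i = 1; i <= NF; i++) {
			if ($i ~ /^displayName: /) name = substr($i, 14)
			else if ($i ~ /^telephoneNumber: /) tel = substr($i, 18)
		}
		# Entries without a displayName (comments, search result lines) are dropped.
		if (name != "") print name "\t" tel
	}' "$@"
}
```

An entry without a telephoneNumber still prints, with an empty second
column.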

The schema used by your company may vary. I obfuscated information IS
wouldn't want me to leak. I forget off the top of my head how I guessed
the schema.

I can show a few people how to use Kerberos to get single sign-on to
work with your stereotypical Windows shop, for various things like
email, if anybody's curious.

On to a more general matter, and to be pedantic and state the obvious…

If we want to help new users get started, we should teach them to fish
rather than feed them fish, pardon the metaphor, smoothing things out
here and there as appropriate. Training users until they are skilled
takes too much time. Hopefully you won't think me insensitive for the
following: one also can't always teach attitude, learning ethic,
critical thinking, or an open mind. One can only encourage them or make
them fun. We could also give users a list of things to look into, a
roadmap of sorts, or just offer to help them understand the various
tech they use rather than learning it by rote.

Cheers,

Alex Pilon