Issue 5 - Autumn Newsletter 2007

Welcome to another quarterly(ish) edition of the Hillside Computer Services Newsletter.
Things have been very quiet in the computer world, with not many new developments to speak of. So, after a hard think and several coffees, I thought I’d tell you about the Internet, search engines and how searches work.

So, here goes . . . .


The Internet - How Does It Work

So, I have this computer on my desk. It plugs into the wall and into a phone socket and that's how the Internet works. Right?

Wrong. Yes, the power source is necessary for the computer and the phone socket may be what you use to get connected to the Internet. But that isn't how it works. That's just what gains you admission via your Internet Provider e.g. bt.com or aol.com.

It's All About Protocol
You probably know that most people interface with the Internet via e-mail or the World Wide Web (www). That's not all of the Internet. It's just the popular and, relatively speaking, new part. The Internet has a much longer history (try a search on the Internet for more information).

The Internet is based on a series of standard technical protocols (rules) which allow computers located around the world to access specified files on other computers and then view those files. Specifically, the protocols in question are TCP/IP. TCP/IP allows computers to describe data to one another over a network. Every computer hooked up to the Internet understands these two protocols and so can communicate amicably.
TCP/IP is, as that fancy little slash mark suggests, actually two separate things that work together. TCP - transmission control protocol - takes the information you want to send over the Internet and breaks it down into small chunks of data called "packets." IP - Internet Protocol - takes over and routes those packets through computers to get them to their destination. When the packets arrive at the destination computer, TCP reassembles them into something recognizable.
These two protocols allow information to be addressed, routed, and reassembled. You use this technology every single time you use the web.
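
For the curious, here is a minimal sketch in Python of the "break it up, then put it back together" idea. It is only an illustration: the function names and the tiny packet size are my own assumptions, and real TCP numbers its data by byte position and does a great deal more (acknowledgements, resending lost packets and so on).

```python
import random

def split_into_packets(message: str, size: int) -> list[tuple[int, str]]:
    """Break a message into numbered chunks, roughly as TCP does before sending."""
    return [(seq, message[start:start + size])
            for seq, start in enumerate(range(0, len(message), size))]

def reassemble(packets: list[tuple[int, str]]) -> str:
    """Put the chunks back in order using their sequence numbers."""
    return "".join(chunk for _, chunk in sorted(packets))

message = "Hello from Hillside Computer Services"
packets = split_into_packets(message, size=8)

# Packets may take different routes and arrive out of order.
random.shuffle(packets)

received = reassemble(packets)
print(received == message)  # True
```

The sequence number travelling with each chunk is what lets the receiving end rebuild the message no matter what order the packets turn up in.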

There are other protocols involved too. SMTP - simple mail transfer protocol - works with e-mail. FTP - file transfer protocol - is essential for uploading and downloading files to and from other computers. That familiar http in your browser's location or address bar stands for hypertext transfer protocol.
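
As a small illustration, Python's standard library can pick the protocol out of a web address. The `DEFAULT_PORTS` table is my own addition for the example; the port numbers themselves are the standard ones for these protocols.

```python
from urllib.parse import urlsplit

# Standard default ports for two of the protocols mentioned above.
DEFAULT_PORTS = {"http": 80, "ftp": 21}

parts = urlsplit("http://www.hillsidecomputers.co.uk/")
print(parts.scheme)                 # http
print(parts.hostname)               # www.hillsidecomputers.co.uk
print(DEFAULT_PORTS[parts.scheme])  # 80
```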

All of these protocols ensure that the computers attempting to communicate with each other - through e-mail or web pages or any other mechanism - understand each other.

Does that provide all the technology necessary for your computer to hook into the Internet? No. An ISP (Internet Service Provider) is also required.

All those protocols allow computers to communicate. You have a computer. You want to play along. Typically, this is how it works.

Your computer connects to an Internet Service Provider. You may be dialling up or you may be using a high-bandwidth method, such as DSL or cable, to connect. Your computer connects to your ISP's server. Once there, your ISP provides you with the gateway to connect to any other computer that has opened itself up to the world.

When you type in a domain - such as www.hillsidecomputers.co.uk - that domain is translated into a number - the IP address - and you are taken to that specific computer. Once there, your web browser allows you to look at specific files. These files can include programming, text, pictures, sound, or video in various combinations.
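
The name-to-number translation can be sketched as a simple lookup. This is a toy version under stated assumptions: `NAME_TABLE` and `resolve` are made up for the example, and the addresses are from a range reserved for documentation, not the real addresses of these sites. On a connected machine, Python's `socket.gethostbyname` performs a real lookup.

```python
# Hypothetical lookup table standing in for the real Domain Name System.
# The addresses come from the range reserved for documentation examples.
NAME_TABLE = {
    "www.hillsidecomputers.co.uk": "192.0.2.10",
    "www.example.com": "192.0.2.20",
}

def resolve(domain: str) -> str:
    """Translate a domain name into its IP address, ignoring letter case."""
    try:
        return NAME_TABLE[domain.lower()]
    except KeyError:
        raise LookupError(f"no address known for {domain}") from None

print(resolve("www.hillsidecomputers.co.uk"))  # 192.0.2.10
```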

And E-mail?

Do you want an analogy? You write a letter and take it to the postbox. It sits in the postbox until the postman collects it. The address on the front of the envelope directs the letter to a specific person at a specific location. It's then posted in the addressee's letterbox and it sits there until the addressee picks up their letters from off the hall mat.

Which is basically how e-mail works. You address the email - myfriend@somewhere.co.uk - the part before the "@" signifies the person; the part after the "@" the server. Your e-mail sits in a queue on a computer - called a server - at your ISP. Your ISP sends out all the queued up e-mails. Those e-mails go to the specified servers and then are routed to the specified users. The e-mail sits there until the user goes online to pick up her mail.
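
Here is a rough sketch of that journey in Python. The names `parse_address`, `send_queued` and `mailboxes` are my own inventions for the example, and a real mail server is far more involved, but the split at the "@" and the queue-then-deliver idea are the same.

```python
from collections import defaultdict

def parse_address(address: str) -> tuple[str, str]:
    """Split an e-mail address into (person, server) at the last '@'."""
    person, at, server = address.rpartition("@")
    if not at or not person or not server:
        raise ValueError(f"not a valid address: {address!r}")
    return person, server

# Each server keeps one mailbox per user: server -> person -> list of messages.
mailboxes = defaultdict(lambda: defaultdict(list))

def send_queued(outgoing_queue: list[tuple[str, str]]) -> None:
    """Deliver every queued (address, text) pair to the right mailbox."""
    for address, text in outgoing_queue:
        person, server = parse_address(address)
        mailboxes[server][person].append(text)
    outgoing_queue.clear()

queue = [("myfriend@somewhere.co.uk", "See you on Friday")]
send_queued(queue)
print(mailboxes["somewhere.co.uk"]["myfriend"])  # ['See you on Friday']
```

The message then sits in the mailbox list until the user "goes online" and reads it.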

That's It
When you access the Internet, you are simply using a series of protocols that have been developed so that you can view, download, and send and receive data from a computer that isn't yours. Pretty cool, eh?

So you’ve used the Internet and you’ve performed web searches, haven’t you? And you’ve just read how the Internet works. So here goes with an easy explanation of how web searches are performed.

Three That Are One

Crawler-based search engines are made up of three major elements: the spider, the index, and the software. Each has its own function and together they produce what we have come to trust (or distrust) on the SERPs (Search Engine Results Pages). For example, when you search for Toyota cars in Google, you get masses of page listings.

The Hungry Spider

Also known as a web crawler or robot, a search engine spider is an automated program that reads web pages and follows any links to other pages within the site. This is often referred to as a site being "spidered" or "crawled". There are many spiders on the Net; three of the main, very hungry and active ones are Googlebot (Google), Slurp (Yahoo!) and MSNBot (MSN Search).

A spider starts its journey with a list of page URLs that have previously been added to its index (database). As it visits these pages, crawling the code and copy, it adds any new pages (links) that it finds to its index. As such, one could describe a spider as feeding an evolving index, which is discussed below.
The spider returns to the sites in its index on a regular basis, scanning for any changes. How often the spider returns is up to the search engine providers to decide.
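
To make the spider's journey concrete, here is a minimal sketch in Python. The miniature `WEB` of five pages is entirely made up, and a real spider fetches pages over the Internet rather than from a table, but the "start from known pages, follow links, add anything new" loop is the idea described above.

```python
from collections import deque

# Hypothetical miniature web: each page URL maps to the links found on it.
WEB = {
    "/home": ["/about", "/services"],
    "/about": ["/home", "/contact"],
    "/services": ["/contact"],
    "/contact": [],
    "/orphan": [],  # nothing links here, so the spider never finds it
}

def crawl(seed_urls: list[str]) -> list[str]:
    """Visit pages starting from the seeds, following links to new pages."""
    index = []                 # pages in the order they were discovered
    seen = set(seed_urls)
    to_visit = deque(seed_urls)
    while to_visit:
        url = to_visit.popleft()
        index.append(url)
        for link in WEB.get(url, []):
            if link not in seen:   # only queue pages not already known
                seen.add(link)
                to_visit.append(link)
    return index

print(crawl(["/home"]))  # ['/home', '/about', '/services', '/contact']
```

Notice that "/orphan" never appears: a page that nothing links to, and that is not on the starting list, stays invisible to the spider.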

The Growing Index

An index is like a giant catalogue or inventory of websites containing a copy of every web page and file that the spider finds. If a web page changes, this catalogue is updated with the new information. To give you an idea of the size of these indexes, the latest figure released by Google is 8 billion pages.

It sometimes takes a while for new pages or changes that the spider finds to be added to its index. Thus, a web page may have been "spidered" but not yet "indexed." Until a page is indexed - added to the index - it will not be available to those searching with the search engine.

The Performing Search Engine

At the end of the day a search engine is a software program designed to sift through billions of pages recorded in its index to find matches to a search query and rank them in an order that it believes is most relevant. Quite a mouthful.

How do search engines go about determining relevancy, when confronted with hundreds of millions of web pages to sort through? Each search engine has developed a set of rules and mathematical equations, known as an algorithm, which it uses to set the order of its rankings.

Exactly how a particular search engine's algorithm works is a closely-kept secret, but some general rules are clear, and these are often used to increase a website's ranking performance. This is referred to as search engine optimisation.

In a nutshell, search engines use on and off page copy to group related pages into vertical themes. If we take a page relating to the film industry, these themes or groups could be entertainment, film entertainment, film star entertainment, etc. Each theme has common words and phrases that best describe the pages the group contains. Some pages may belong to more than one group. For instance, a page relating to movie profits could belong to both financial and entertainment groups.

The SERP (or Search Engine Results Page)

After applying this algorithm to their index of sites, a search engine comes up with a list of the most relevant results according to the search conducted.

To simplify an otherwise complex process, when a user enters a search query, the search engine analyses and searches its index for the pages it considers relevant to the query. Once it has a shortlist of the relevant pages, it further calculates the order in which they are presented to the user, based on further algorithmic factors. These could include a user's location and possibly even their search history.
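
The two steps, finding matches in the index and then putting them in order, can be sketched in a few lines of Python. Bear in mind that the real ranking rules are secret, so this is only a toy under my own assumptions: the three `PAGES` are made up, and the "score" is nothing more than a count of how often the query words appear on each page.

```python
from collections import Counter, defaultdict

# Hypothetical indexed pages (page name -> page text).
PAGES = {
    "toyota-cars": "toyota cars for sale new toyota models",
    "car-reviews": "reviews of cars from many makers",
    "film-news": "film stars and film profits",
}

# Build the index: each word maps to the pages containing it, with counts.
index = defaultdict(Counter)
for page, text in PAGES.items():
    for word in text.split():
        index[word][page] += 1

def search(query: str) -> list[str]:
    """Return matching pages, the highest word-count score first."""
    scores = Counter()
    for word in query.lower().split():
        scores.update(index.get(word, {}))
    return [page for page, _ in scores.most_common()]

print(search("toyota cars"))  # ['toyota-cars', 'car-reviews']
```

Because the index is built in advance, the search itself only has to look up the query words rather than read every page again.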

This algorithm differs between engines, which is why different search engines may produce different results for the same query. Each search engine has its niche. It is however not uncommon for a user to use more than one search engine at a time. This further demonstrates the importance for website owners to be indexed and ranked well on all search engines.

Conclusion

The aim of a search engine is to put itself in its user's shoes. It therefore wants to deliver appropriate, relevant, information-rich sites that will satisfy users first time round. But it doesn’t always work.

Everyone who uses the Internet uses a search engine, and I bet that you use either Google or Yahoo. But did you know that there are oodles more which you can use?

On the plus side, Google has one of the largest databases of Web pages, including many other types of web documents (blog posts, wiki pages, group discussion threads) and document formats (e.g. PDFs, Word or Excel documents, PowerPoints). Despite the presence of all these formats, Google's popularity ranking often makes pages worth looking at rise to near the top of search results. On the down side, not all web sites get “registered” with them, so you might not find what you’re looking for (see previous article on How Search Engines Work).

Using Google alone is often not sufficient. Less than half the searchable Web is fully searchable in Google. Studies show that about half of the pages in any search engine database exist only in that database, so getting a second opinion is therefore often worth your time. For a second opinion you could try:

altavista.co.uk
ask.co.uk
lycos.co.uk
dogpile.co.uk
yahoo.co.uk

These of course can be .coms, e.g. yahoo.com or ask.com, and all are prefixed by “www.”

Another one to try is www.searchenginecolossus.com
This is a directory of search engines listed by country. Click on the country you are in or interested in and then select the appropriate search engine.

I found quite interesting results when I tried a variety of different search engines: sites which were at the top of one engine’s results were halfway down, so to speak, on a second search engine's results, or not found at all.

When I first uploaded my own web site, www.hillsidecomputers.co.uk, it took about 12 months for it to be “found” by Google, yet it took only a week before Yahoo listed it.

So, if you’re looking for something specific or can’t find what you’re looking for, try a different search engine.


Microsoft Office

Here's a big number: 20 percent of Microsoft Office's U.S. retail sales are the Mac version, according to NPD. Here's another: Mac users account for 10 percent of retail Windows Vista Business and Ultimate sales.

Those startling statistics are yet another sign of the Mac's resurgence and of the huge opportunity it creates for the largest Macintosh developer outside of Apple. Microsoft produced Mac versions of Excel and Word long before the products went to Windows. Apple's resurgence is an opportunity for Microsoft to recall those early glory days.

But should Microsoft want to? After all, Macs compete with Windows PCs. Does Microsoft really want to be in the business of supporting a competitor?

"Twenty percent of all the versions of Microsoft Office sold at retail are the Mac version," said Chris Swenson, NPD's director of software industry analysis. "Office 2004 for the Mac is selling like hot cakes."

The 20 percent number is impressive for a three-and-a-half-year-old product whose successor is scheduled for a January release. The sales also greatly exceed various analyst estimates of Apple's U.S. market share of 5 percent. The real number of users could be considerably higher.



A Brief History Of Windows Operating Systems

Year   Operating System            Upgrades / Sub Sets

1985   Windows 1.0
1986
1987   Windows 2.0
1988
1989
1990   Windows 3.0
1991
1992   Windows 3.1                 3.1, 3.11
1993   NT 3.5                      3.11 Windows For Work Groups
1994   NT 3.55
1995   Windows 95                  OSR1, OSR2, OSR2.5, OSR3
1996   NT 4.0
1997
1998   Windows 98                  Windows 98 2nd Edition
1999
2000   Windows ME (Millennium),
       Windows 2000 Professional
2001
2002   Windows XP                  Home, Professional, XP 64 Bit, Media Centre
2003   Windows XP Service Pack 2   Media Centre 2003
2004
2005                               Media Centre 2005
2006
2007   Windows Vista               32 & 64 Bit
2008   Windows Vista Service Pack 2

They have been busy, haven’t they?

There are other Microsoft operating systems / versions, but these are more business-oriented and aren't covered here.


Computer Joke

A Microsoft support man goes to a firing range. He shoots 10 bullets at a target 50m away. The supervisors check the target, see that there's not a single hit, and shout to him that he has missed completely. He tells them to recheck, and gets the same answer. So he puts his finger over the end of the gun and shoots, blasting off his finger. When he sees it, he shouts back "I don't know, it's working perfectly here, the problem must be yours..."

So there you are. A few tasty morsels about the Internet, which I hope you found useful.
Don’t forget to keep your computer and software up to date by applying any necessary updates, regularly scan your computer for viruses / nasties and, of course, back up your data regularly.
The next issue will be sent out in January or February next year and I hope you enjoyed the read. If not, let me know and I’ll do something about it.

Don’t forget that back issues of the Newsletters are available to download and print off in PDF format from the Hillside Computer Services web site:

Mike Hamilton

Hillside Computer Services
1, Hillside
Cross Green
Hartest
Bury St. Edmunds
Suffolk IP29 4ED

(01284) 830830

info@hillsidecomputers.co.uk

www.hillsidecomputers.co.uk
