December 27, 2007

Visual Basic Yesterday, Today and Tomorrow

A month or two ago, Paul Yuknewicz and I sat down to record a Hanselminutes podcast with Scott Hanselman, talking about the past, present and future of Visual Basic. It was a lot of fun; check it out!


Also, here’s a little holiday love for VB from some Microsofties you might recognize:

Related Posts

(The silent majority…) (What the heck is "VBx"?) (Hello Visual Basic developers’ center!) (After MIX, how many Visual Basic languages are there?) (Lang .NET 2008, Scripting, and Visual Basic)

(Almost) final VB 9.0 language specification posted

I wanted to let people know that an (almost) final VB 9.0 language specification has been posted on the download center. The spec is missing some copy-edits from the documentation folks, but is otherwise complete. Since I’m not going to get a chance to incorporate the copy-edits until I am back from vacation in January, I wanted to get the spec out there for anyone interested in documentation of the XML features that weren’t present in the previous version of the spec. (I apologize for the lateness of this vis-a-vis the release of the product itself; it’s been a busy fall.)

This updated language specification corresponds to Visual Studio 2008 and covers the following major new features:


  • Friend assemblies (InternalsVisibleTo)
  • Relaxed delegates
  • Local type inferencing
  • Anonymous types
  • Extension methods
  • Nullable types
  • Ternary operator
  • Query expressions
  • Object initializers
  • Expression trees
  • Lambda expressions
  • Generic type inferencing
  • Partial methods
  • XML Members
  • XML Literals
  • XML Namespaces

Questions, comments or criticisms can be sent to basic@microsoft.com. Thanks!

Related Posts

(VB language spec 8.0 now available…) (Language Specification: Useful? Not?) (Beta VB 9.0 language specification released…) (The big "D"-word) (Checking Cache-Coherence Protocols With TLA+)

December 26, 2007

Insulate Your Code with the Provider Model

Find out how to protect your code from changes by taking advantage of the Provider Model, which lets you swap components at run time.

Related Posts

(MSDN Architecture Webcast: ADO and SQL Web Services (Level 200)) (The OSI Model: Understanding the Seven Layers of Computer Networks) (Common Data Model and the CMDB) (Exploit new data access functionality in SQL Server 2005) (The ASP.NET Page Object Model)

December 25, 2007

Axandra’s Christmas SEO crossword puzzle

Instead of our regular weekly articles, news and facts, we decided to publish a Christmas crossword puzzle about search engine optimization this week. Have fun! :-)

Related Posts

(Axandra’s Christmas SEO crossword puzzle) (Imminent arrivals…) (Latin as a prerequisite for programming?) (Talk at Yale: Part 2 of 3)

December 24, 2007

Travel Tips


Inc.com published my list of travel tips from the World Tour. You’ll learn how we completely avoided air travel snafus, what equipment we brought along, and more.

Not loving your job? Visit the Joel on Software Job Board: Great software jobs, great people.

Related Posts

(Serving The Web: Nine Tips to Enhance IIS Security) (Insider tips for your Yahoo rankings) (TripIt is awesome) (The Baker’s Dozen: 13 Productivity Tips for ADO.NET 2.0) (The Baker’s Dozen: 13 Productivity Tips for Generating PowerPoint Presentations)

Getting Lookout to run on Outlook 2007 again

The search feature in Microsoft Outlook 2007, frankly, sucks big time.

It’s slow. Searches take about 30 seconds for me. (I have about 10 years of email.)

You have to wait for it to fail to find things in your inbox before you’re permitted to search elsewhere, even if you know the message isn’t in your inbox.

The search quality is atrocious. I regularly get 50% garbage results mixed in that have nothing in common with my search terms, and the message I am looking for often doesn’t come up.

The patch helps, but it still takes around 30 seconds to do a search.

It didn’t use to be that way. A few years ago, there was a great add-in called Lookout for Outlook, based on Lucene.NET. Searches always took less than a second.

The tiny company that made Lookout was bought by Microsoft. It must have been one of those HR acquisitions, because the Lookout technology was thrown away. Mike Belshe only spent a couple of years at Microsoft before moving on.

When Outlook 2007 came out, it disabled Lookout, and allegedly this wasn’t supposed to be a big deal because Outlook 2007 has search “built in.” But the built-in search is, as mentioned, ghastly.

Last week I had finally had enough. I can’t work like this. I spent some time searching on the net and found that the original author of Lookout, Mike Belshe, had just posted instructions for getting Lookout to work on Outlook 2007.

They worked! Lookout is back!

It’s fast! The first search takes about a second. After that something seems to be cached in memory and further searches appear as fast as you hit the “enter” key.

Related Posts

(Outlook 2007: downgrade no longer) (Microsoft, Please Teach Me) (MS08-015 - Critical: Vulnerability in Microsoft Outlook Could Allow Remote Code Execution (949031) - Version:1.1) (MS07-056 - Critical: Security Update for Outlook Express and Windows Mail (941202) - Version:2.0) (MS06-016: Cumulative Security Update for Outlook Express (911567) - Version:1.0)

December 20, 2007

What is the longest part of the InnoDB Recovery Process?

In MySQL 4.1 and earlier, the longest part of crash recovery for InnoDB tables could be the UNDO stage. It ran in the foreground and was basically unbounded: if a large enough transaction needed to be undone, this could take many hours.
The REDO stage, on the other hand, could always be regulated by the size of your InnoDB log files, so you could make it as long or as short as you like. Read more about it here.

Since MySQL 5.0 the UNDO stage runs in the background, so it can still be the longest stage but no longer leaves the server completely unusable (some limitations still apply, though).

In the case I’ve been working on recently, neither of these stages was the longest one.
The server had about 65000 tables using innodb_file_per_table, so the “InnoDB: Reading tablespace information from the .ibd files…” stage was taking most of the time.

Happily, InnoDB only needs to scan the .ibd files when it was not shut down correctly; otherwise restarts would be even more painful.

An even longer phase, “Opening Tables”, has more to do with restarts in general than with crash recovery, as it is present in normal restarts as well. Because InnoDB has to recompute the statistics the first time it opens a table, this can take a significant amount of time. Worst of all, there is serialization in the table cache, so only one table can be opened at a time as of MySQL 5.0.

It would be great if InnoDB would finally, optionally, store stats the same way MyISAM does, so one could recompute them in the background. MySQL should also fix things so that more than one table can be opened at the same time (though I have not tested whether this is still the case with 5.1, which has had its table_cache code dramatically rewritten).


Entry posted by peter |
No comment

Related Posts

(Magic Innodb Recovery self healing) (How Innodb flushes data to the disk ?) (Innodb Second Start prevention bug ?) (Innodb Recovery Update - The tricks what failed.) (Innodb crash recovery update)

Large result sets vs. compression protocol

The mysql_connect() function in PHP’s MySQL interface (which for reference maps to the mysql_real_connect() function in the MySQL C API) has had a $client_flags parameter since PHP 4.3.0. This parameter is barely known and almost always overlooked, but in some cases it can give your application a nice boost.

There are a number of different flags that can be used. We’re interested in one in particular, MYSQL_CLIENT_COMPRESS. This flag tells the client application to enable compression in the network protocol when talking to mysqld. It reduces network traffic at the cost of some CPU time: the server has to compress the data and the client has to decompress it. So there’s little sense in using it if your Web application is on the same host as the database.

When the database is on a dedicated server, compression essentially means trading CPU time (on both server and client) for network time. Obviously, if the network is fast enough, the benefit in network time will not outweigh the loss in CPU time. The question is, where exactly does the border lie?

It turns out that a 100 Mbit link (with a 1.4 ms round-trip time) is not fast enough. Oleksandr Typlynski, one of the Sphinx users, conducted a benchmark, indexing 600 MB of data over a 100 Mbit link. The data was textual and compressed well, reducing traffic more than 3 times. With compression, total indexing time dropped from 127 sec to 87 sec. That’s almost a 1.5x improvement in total run time. The improvement in MySQL query time is even greater. On the other hand, a 1 Gbit link was fast enough, and total run time was 1.2 times worse with compression.

The bottom line: if you’re fetching big result sets to the client, and client and MySQL are on different boxes, and the connection is 100 Mbit, consider using compression. It’s a matter of adding one extra magic constant to your application, but the benefit might be pretty big.
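To get a feel for where the border lies, here is a rough back-of-the-envelope sketch in Python. It is only an illustration: the function name, the simple additive cost model (wire time plus CPU time, with no overlap) and the 40 MB/s CPU throughput are assumptions, not figures from the benchmark. Only the 600 MB data size, the 3x compression ratio and the two link speeds come from the text above.

```python
def transfer_seconds(size_mb, link_mbit, ratio=1.0, cpu_mb_per_sec=None):
    """Estimate seconds needed to ship size_mb of result data over a link.

    ratio is the compression ratio (1.0 means no compression).
    cpu_mb_per_sec is a hypothetical combined compress + decompress
    throughput; None means no CPU cost is charged.
    """
    wire = (size_mb / ratio) * 8 / link_mbit  # megabits sent / link speed
    cpu = size_mb / cpu_mb_per_sec if cpu_mb_per_sec else 0.0
    return wire + cpu


ASSUMED_CPU_MB_PER_SEC = 40  # hypothetical figure, not measured

for link in (100, 1000):
    plain = transfer_seconds(600, link)
    packed = transfer_seconds(600, link, ratio=3.0,
                              cpu_mb_per_sec=ASSUMED_CPU_MB_PER_SEC)
    print(f"{link} Mbit: plain {plain:.1f} s, compressed {packed:.1f} s")
```

Under these assumed numbers compression wins on the 100 Mbit link and loses on the 1 Gbit link, which has the same shape as the benchmark result. The absolute values should not be compared with the measured run times, since the model covers only moving the result data and ignores query execution and indexing.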


Entry posted by shodan |
3 comments

Related Posts

(MySQL net_write_timeout vs wait_timeout and protocol notes) (Compress Binary Messages Using SQL) (The Cable Guy: The Authenticated Internet Protocol) (MySQL Blob Compression performance benefits) (Architecture and Data Model of a WebDAV-Based Collaborative System)

December 19, 2007

MVCC: Transaction IDs, Log Sequence numbers and Snapshots

MySQL storage engines that implement Multi Version Concurrency Control (MVCC) have several internal identifiers related to it. I see a lot of people confused about what they are and why they are needed, so I decided to take some time to explain them a bit. This is a general explanation; it does not correspond to InnoDB in particular, and some implementations may differ, but I hope it will help you understand MVCC a bit better.

Transaction ID: As the name says, this is a transaction identifier. The engine can use it for many things: for lock handling, to see which transaction holds a lock and possibly kill it in case of a deadlock; for proper isolation mode handling, since a transaction should see its own uncommitted changes while other transactions typically do not; and for the MVCC implementation itself. It can also be used in recovery, where it shows which transaction a given change belongs to, so that changes can be rolled back or redone depending on whether that transaction was committed.

Log Sequence Number (LSN): A Log Sequence Number corresponds to a given position in the log files and is typically incremented for each log record. InnoDB uses the number of bytes ever written to the log files, but it could be something different. LSNs are used extensively in recovery, checkpointing and buffer management operations. When a checkpoint (fuzzy or not) happens, you get something like “all changes up to LSN=X are now flushed to the data space”, which means you can discard or archive logs for LSNs earlier than that. During log recovery, checking the LSN in a log record tells you whether the change needs to be applied or has already been applied (when doing recovery you do not know which dirty pages were flushed from the buffer pool).

LSNs do not relate much to transactions: changes from different transactions are intermixed in the log files, and many LSNs can correspond to changes from the same transaction.
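As a minimal sketch of the recovery check, assume each page remembers the LSN of the last change applied to it. That is a common design, but it is an assumption here, as are all the class and function names below; this is not InnoDB's actual code. A log record is replayed only when it is newer than both the checkpoint and the page it touches.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    page_lsn: int = 0  # LSN of the last change already applied to this page
    data: dict = field(default_factory=dict)


@dataclass
class LogRecord:
    lsn: int
    page_id: int
    key: str
    value: int


def redo(pages, log, checkpoint_lsn):
    """Replay log records in LSN order; return the LSNs actually applied."""
    applied = []
    for rec in sorted(log, key=lambda r: r.lsn):
        if rec.lsn <= checkpoint_lsn:
            continue  # covered by the checkpoint, already in the data space
        page = pages.setdefault(rec.page_id, Page())
        if rec.lsn <= page.page_lsn:
            continue  # this page was flushed after the change was made
        page.data[rec.key] = rec.value
        page.page_lsn = rec.lsn
        applied.append(rec.lsn)
    return applied


# Hypothetical state after a crash: page 1 was flushed at LSN 120, page 2 at LSN 90.
pages = {1: Page(page_lsn=120, data={"a": 2}), 2: Page(page_lsn=90)}
log = [
    LogRecord(80, 1, "a", 1),
    LogRecord(110, 1, "a", 2),
    LogRecord(130, 2, "b", 7),
    LogRecord(140, 1, "a", 3),
]
applied_lsns = redo(pages, log, checkpoint_lsn=100)
```

In this made-up run only the records with LSN 130 and 140 are applied: LSN 80 is older than the checkpoint, and LSN 110 is already reflected in page 1, which was flushed at LSN 120.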

Snapshot Versions: This term is often used to identify what will be visible to a given transaction or query; however, MVCC is usually implemented without anything like a snapshot number.

Indeed, if you look at it in detail, visibility for a transaction is more complicated than some given “snapshot”: a transaction sees its own changes even if they are not committed.

What happens instead is that when a transaction is started, the list of concurrently running (not yet committed) transactions is memorized.

For each row, more than one version can be stored, each tagged with the IDs of the transactions that added/modified and deleted that version. This data can be compressed or optimized to keep less information where possible, i.e. you do not really need to store the transaction ID for a row version that is so old it is visible even to the oldest running transaction.

So what you get for a given row is something like:

Value 5 created by transaction 100
Value 4 created by transaction 50
Value 3 created by transaction 10, deleted by transaction 20.

This tracks the most recent object history, which tells you the row was created with value 3 by some transaction, then was deleted, so that no such row (i.e. with the given PK) existed for some time, and then a couple of new versions were created.

When looking at a row in an MVCC environment, the system has to determine which of the current row versions is visible. This can be as simple as traversing the list of versions downward, or something more complicated.

In the described case, let’s consider transaction 101, which is running in REPEATABLE-READ isolation mode and which was started before transaction 100 had committed.

In that case transaction 101 should not see the changes made by transaction 100 and will read value 4, created by transaction 50.

We also need to track when a given transaction has committed, for the transactions that were active at the time the current transaction was started. This is needed, for example, to handle READ-COMMITTED isolation mode.

Let’s look at transaction 102, which was started in READ-COMMITTED isolation mode before transaction 100 was committed. On its first read of this row it will read value 4, created by transaction 50. If it reads the row a second time, after transaction 100 has committed, it will see value 5, because it will know transaction 100 was already committed when that read operation started.
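The visibility rules for the example row can be sketched in a few lines of Python. This is a simplified illustration, not how any particular engine does it: the names are made up, transaction IDs are assumed to be handed out in start order, and any transaction with a higher ID than the reader is simply treated as invisible. REPEATABLE-READ is modelled by reusing the active list memorized at transaction start, READ-COMMITTED by passing a fresh active list on every read.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RowVersion:
    value: int
    created_by: int
    deleted_by: Optional[int] = None


# Newest version first, matching the example history in the text.
HISTORY = [
    RowVersion(5, created_by=100),
    RowVersion(4, created_by=50),
    RowVersion(3, created_by=10, deleted_by=20),
]


def sees(txn_id, other_id, active):
    """True if changes made by other_id are visible to txn_id.

    active is the set of transactions that were still running (not
    committed) when the reader's view was taken.
    """
    if other_id == txn_id:
        return True  # a transaction always sees its own changes
    return other_id < txn_id and other_id not in active


def read_row(history, txn_id, active):
    """Return the visible value of the row, or None if no row is visible."""
    for version in history:
        if not sees(txn_id, version.created_by, active):
            continue  # created by a transaction we must not see yet
        if version.deleted_by is not None and sees(txn_id, version.deleted_by, active):
            return None  # the newest visible state is "row deleted"
        return version.value
    return None


# REPEATABLE-READ: transaction 101 keeps the list memorized at its start.
rr_value = read_row(HISTORY, 101, active={100})

# READ-COMMITTED: transaction 102 takes a fresh list for every read.
rc_first = read_row(HISTORY, 102, active={100})   # 100 still running
rc_second = read_row(HISTORY, 102, active=set())  # 100 has committed
```

Here `rr_value` and `rc_first` are both 4, and `rc_second` is 5, matching the walk-through above. A reader old enough to predate transaction 20, say a hypothetical transaction 15, would still get value 3, while a hypothetical transaction 30 would see no row at all.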

Note: These are far from all the functions that MVCC storage engines may assign to these identifiers. This is not an exhaustive list, just some examples to help you understand why you need all of them.


Entry posted by peter |
No comment

Related Posts

(When to use ORDER in Sequences in PL/SQL) (SQL Q+A: Clusters, Snapshots, Log Shipping, and More) (Improving Service Levels Through Transaction Visibility) (A Numbers Game to Test Your SQL Skills) (Convert Numbers to Excel Column Names)