Someone I'm connected to on LinkedIn is sourcing a job. One of the myriad of requirements is that the person had experience with
To our Excel Gurus...
Answers
Wayne,
Broadly speaking, a worksheet with more than 100K rows could be of two types. One type could be an analysis with 100K+ rows. If that's what the employer has in mind, I definitely agree that Excel reports and analyses NEVER should be that large.
The other type of workbook could be a data workbook, which could contain Excel Tables much larger than 100K rows. (Tables with *no* calculations can approach a million rows.) Done right, this type of design would allow any number of report workbooks to reference several such data workbooks.
There are many advantages to this type of design for Excel reports. Some include...
...The data typically comes from CSV or export-to-Excel files, so there's no manual data entry, just copy and paste to a data workbook each period.
...By updating a data workbook once, you automatically update any number of reports or analyses that link to that workbook.
...Data workbooks can include sources of data not readily available to "better" analytical products, this could include data from silo systems, economic data, weather data, benchmarks, competitive data, population data, interest and exchange rates, and so on.
...Data integrity need not be an issue with this design, for several reasons: (1) Users easily can create data-scrubbing Excel reports. (2) The Excel data has an audit trail from a report, to the data workbook, to the original source of the data. (3) When many Excel users analyze the same data set, they're bound to uncover any errors that *do* exist, and those errors typically surfaced in the source data, not Excel. (4) With large data sets available, there's typically enough data for the Excel reports and analyses to be reconciled, a process that can reveal errors in both the reports *and* the source data.
...And finally, the knowledge workers who would tend to create the reports and analyses live and breathe Excel. So they have the benefits of using a product they know well to report and analyze consistent data from a reliable source.
Thanks Charley
Of course, the source request didn't specify one way or the other as to the type of data being kept.
One question. Putting ad-hoc queries aside (which is a big aside for many reasons, such as defining what ad-hoc means in the first place) isn't pre-programming a report from a database optimized to do retrieve and compute a more effective method of retrieving results?
Let me answer your question this way...
My first job after my MBA was as a cost
One day, the division's
The program was VisiCalc, and I fell in love with it. Finally, I had found a program that allowed me to combine data and my subject-matter expertise. I was no longer restricted to just dumping out historical data for my own use. Instead, spreadsheets allowed me to combine historical data *from any available source* with sophisticated calculations and (at times) assumptions to generate presentations that gave *meaning* to data.
Spreadsheets gave me the power to give senior managers the information they needed, but didn't know they could get.
Canned reports *document* past performance; good spreadsheet reports *explain* past performance and help to *predict* future performance. We call all of them "reports", but there are significant differences among them.
I remember VisiCalc. BTW, as a sidebar, the creation and life of VisiCalc is fascinating reading. Maybe one of the greatest tales of the perils of sitting on your laurels.
I guess my biggest issue with spreadsheets is when they are used as source of data, not a reporting engine.
I suspect the data source (i.e., the thing that will turn into the final analysis) is 100k+ rows. But, the end result used for analysis is probably a pivot table or other summarized report.
By the way, 100k+ rows isn't really overwhelming if a person needs to be an Excel guru. The newer version of excel allows for 1m+ rows.
This made me check the number of rows in a spreadsheet I have with sales detail from last year for a client -- it comes in at 70,000. It's basically a table with data exported from several
I would not want to work with an 100k Excel spreadsheet that contained formulas. That is a recipe for disaster if a formula goes wrong or is accidentally overwritten. It's ok if we are only talking the data portion which rolls up to another sheet with the formulas and even then you have to be very careful that those formulas are pulling the data properly. For very large data sets, I prefer to do the preliminary work in an Access database and then pull the results into Excel to do any further processing if necessary. Access is much easier to use with large amounts of data although you still have to be careful that it is consolidating the data properly.
Wayne,
>I guess my biggest issue with spreadsheets is when they are used as source of data, not a reporting engine.
In no particular order, here are the general sources of data I know of that we can use for creating spreadsheet reports:
1. Manual data entry in random cells needed for a report, or "hand-carved" reports that combine imported data with a lot of manual manipulation in the spreadsheet. These are horrible methods, because they're slow and error-prone. As a general rule, if your worksheet report contains numbers in cells, you're making a big mistake.
2. PivotTables. The good thing about pivots is that they're live-linked to external sources of data. The bad things are: You're typically limited to relational sources. You're limited to documenting historical data, not explaining it. You're limited to one view of data for each report. Few users know how to create pivots. And fewer still know how to set up pivots as an Excel database.
3. Excel-friendly OLAP. These are programs like TM1 and PowerOLAP, which provide worksheet functions that (1) return data from cubes on the server, and, (2) allow users to write data for budgets and forecasts directly from spreadsheets back to the server. These are very powerful solutions.
4. ERP-linked software. Products from F9 Software, Global Software, BizNet Software, and the like, offer worksheet functions that are live-linked to ERP data. They offer a lot of power for reporting against SAP, Sage, Microsoft Dynamics, and other ERPs.
5. Data worksheets. Using this method offers the most-flexible and lowest-cost way to live-link spreadsheet reports to reliable sources of data.
Sources 3, 4, and 5 all can be used on one spreadsheet to create one report that's live-linked to the three sources of data. Even one cell in the report could be linked to all three sources.
So, Wayne, other that source #1, which of these do you have a problem with? Or do you have additional problematic methods in mind?