Extracting data from the searching results of citation database
Zhanshan Dong
Today (Sept. 25, 2002), my major advisor asked me to extract citation data from one text
file he created and sent me through email. After I looked at the data, I realized that the
data came from the citation databases, such as AGRICOLA, Biological Abstract, CAB, and so
on. The requirements that you asked me to do are listed in the following:
(1) Eliminate the (many) duplicates from the list;
(2) Send me a Word file containing citations & abstracts for just the articles that have
abstracts (as soon as 1 is done);
(3) Make a list with all citations & abstracts in chronological order;
(4) Make a list of all citations only in normal, bibliographic order.
At the very biginning, I had no idea that I fullfilled the tasks. I might take a long time
to do it starting from the original text file. In order to accelerate the processing
procedure, I tried to write a small VBA program in Excel to extract the data from the text
file. During I went home to cook the dinner, I took a little time to write a VBA subroutine
to achieve the function. But the filename must be write to the program before running the
macro. After dinner, I spent one hour to write a little bit complex program with userform
to get information from users to costomize some parameters. At this time point, the program
is more general than before.
Later on, when I debugged the program, I found that it could not fullfill the specific
function for extracting information to different columns. I went to K-State libray website
to open the database and searched the database again and saved the results with tags to
file. Then I use this file to extract the citaiton data with a midification of program, I
got a very satisfied results.
After extraction of data from text file, I used Excel function to sort the data and
eliminat the duplicate citations. Now I had the raw data to generate the different types of
data to meet the four requirements.
I opened WORD and created mail-merging main document with the Excel file as the data
source. I quickly formated the merging rules and generated two final documents for my major
professor. The whole procedure took me about 3 hours.
The source codes are listed in the following.
'Source code for userform
'========================
Private Sub btnCancel_Click()
frmImport.Hide
End Sub
Private Sub btnImport_Click()
Dim filename
Dim fs, a
filename = tbFilename.Text
Set fs = CreateObject("Scripting.FileSystemObject")
If fs.FileExists(filename) Then
Set a = fs.OpenTextFile(filename, 1)
Row = 1
Do While Not (a.AtEndofStream)
Line = a.readline
pos = InStr(Line, "Record ")
If pos = 1 Then
If cbSpaceLine.Value Then
Line = a.readline
End If
Line = a.readline
i = 1
If Not cbTag.Value Then
Do Until Line = ""
Cells(Row, i) = Line
Line = a.readline
i = i + 1
Loop
Else
auline = ""
tiline = ""
abline = ""
soline = ""
pyline = ""
Do Until Line = ""
fieldname = Left$(Line, 2)
Line = Right$(Line, Len(Line) - 5)
Select Case fieldname
Case "AU"
auline = Line
Case "TI"
tiline = Line
Case "AB"
abline = Line
Case "SO"
soline = Line
Case "PY"
pyline = Line
End Select
Line = a.readline
Loop
If Not (cbIgnore.Value) Then
Cells(Row, 1) = auline
Cells(Row, 2) = pyline
Cells(Row, 3) = tiline
Cells(Row, 4) = soline
Cells(Row, 5) = abline
ElseIf (cbIgnore.Value And abline <> "") Then
Cells(Row, 1) = auline
Cells(Row, 2) = pyline
Cells(Row, 3) = tiline
Cells(Row, 4) = soline
Cells(Row, 5) = abline
End If
End If
If Not (cbIgnore.Value) Then
Row = Row + 1
ElseIf abline <> "" Then
Row = Row + 1
End If
End If
Loop
a.Close
Set a = Nothing
End If
Set fs = Nothing
If cbElimDups.Value Then Module1.deletedups
frmImport.Hide
Unload frmImport
End Sub
Sub deletedups()
' The rocords with same title are thought as duplicate records.
' This macro keep one of them and delete others.
Selection.Sort Key1:=Range("A1"), Order1:=xlAscending, Header:=xlGuess, _
OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom
Row = 1
Do Until Cells(Row, 3) = ""
curtitle = LCase(Cells(Row, 3))
i = Row + 1
done = False
Do Until done
newtitle = LCase(Cells(i, 3))
If newtitle = curtitle Then
Rows(CStr(i) & ":" & CStr(i)).Select
Selection.Delete Shift:=xlUp
Else
done = True
End If
Loop
Row = Row + 1
Loop
End Sub
If you want to use the code, you can download the source code following the link at the
bottom of thsi page. After importing the code (frmImport.frm and module1.bas) to your VBA
project, click Tools->Macro->Macros to open "Macro" dialog, then select macro
"ExtractCites" and click Run button, a dialog will pop up. Enter the file name which
include your data, check the proper checkbox items, click "Import Citations" to run the
program. All the data will be extracted to the current active worksheet.
Download source code
|