编程技术、软件应用与系统模拟

(Programming, Applicaiton and Simulation)



本站目录

 

首页
ASP/Access/IIS
DELPHI/PASCAL
PASCAL高级编程
C语言编程实例
WORD
Excel
MATLAB
MINITAB讲座
Windows
DOS
SAS
生物系统模拟
土壤水分剖析器
其他



镜像站点

 

主站
北美镜象站
欧洲镜象站(1)
欧洲镜象站(2)

本站 Google

[搜索]  [站内导航]
座右铭:
只做有益人类的事
不做有害人类的事


Extracting data from the searching results of citation database

Zhanshan Dong

Today (Sept. 25, 2002), my major advisor asked me to extract citation data from one text
file he created and sent me through email. After I looked at the data, I realized that the
data came from the citation databases, such as AGRICOLA, Biological Abstract, CAB, and so
on. The requirements that you asked me to do are listed in the following:

(1) Eliminate the (many) duplicates from the list;
(2) Send me a Word file containing citations & abstracts for just the articles that have
abstracts (as soon as 1 is done);
(3) Make a list with all citations & abstracts in chronological order;
(4) Make a list of all citations only in normal, bibliographic order.

At the very biginning, I had no idea that I fullfilled the tasks. I might take a long time
to do it starting from the original text file. In order to accelerate the processing
procedure, I tried to write a small VBA program in Excel to extract the data from the text
file. During I went home to cook the dinner, I took a little time to write a VBA subroutine
to achieve the function. But the filename must be write to the program before running the
macro. After dinner, I spent one hour to write a little bit complex program with userform
to get information from users to costomize some parameters. At this time point, the program
is more general than before.

Later on, when I debugged the program, I found that it could not fullfill the specific
function for extracting information to different columns. I went to K-State libray website
to open the database and searched the database again and saved the results with tags to
file. Then I use this file to extract the citaiton data with a midification of program, I
got a very satisfied results.

After extraction of data from text file, I used Excel function to sort the data and
eliminat the duplicate citations. Now I had the raw data to generate the different types of
data to meet the four requirements.

I opened WORD and created mail-merging main document with the Excel file as the data
source. I quickly formated the merging rules and generated two final documents for my major
professor. The whole procedure took me about 3 hours.

The source codes are listed in the following.

'Source code for userform
'========================
Private Sub btnCancel_Click()
    frmImport.Hide
End Sub

Private Sub btnImport_Click()
Dim filename
Dim fs, a

filename = tbFilename.Text
Set fs = CreateObject("Scripting.FileSystemObject")
If fs.FileExists(filename) Then
    Set a = fs.OpenTextFile(filename, 1)
    Row = 1
    Do While Not (a.AtEndofStream)
        Line = a.readline
        pos = InStr(Line, "Record ")
        If pos = 1 Then
            If cbSpaceLine.Value Then
                Line = a.readline
            End If
            Line = a.readline
            i = 1
            If Not cbTag.Value Then
                Do Until Line = ""
                    Cells(Row, i) = Line
                    Line = a.readline
                    i = i + 1
                Loop
            Else
                auline = ""
                tiline = ""
                abline = ""
                soline = ""
                pyline = ""
                Do Until Line = ""
                    fieldname = Left$(Line, 2)
                    Line = Right$(Line, Len(Line) - 5)
                    Select Case fieldname
                    Case "AU"
                        auline = Line
                    Case "TI"
                        tiline = Line
                    Case "AB"
                        abline = Line
                    Case "SO"
                        soline = Line
                    Case "PY"
                        pyline = Line
                    End Select
                    Line = a.readline
                Loop
                If Not (cbIgnore.Value) Then
                    Cells(Row, 1) = auline
                    Cells(Row, 2) = pyline
                    Cells(Row, 3) = tiline
                    Cells(Row, 4) = soline
                    Cells(Row, 5) = abline
                ElseIf (cbIgnore.Value And abline <> "") Then
                    Cells(Row, 1) = auline
                    Cells(Row, 2) = pyline
                    Cells(Row, 3) = tiline
                    Cells(Row, 4) = soline
                    Cells(Row, 5) = abline
                End If
            End If
            If Not (cbIgnore.Value) Then
                Row = Row + 1
            ElseIf abline <> "" Then
                Row = Row + 1
            End If
        End If
    Loop
    a.Close
    Set a = Nothing
End If
Set fs = Nothing

If cbElimDups.Value Then Module1.deletedups

frmImport.Hide
Unload frmImport
End Sub


Sub deletedups()
' The rocords with same title are thought as duplicate records.
' This macro keep one of them and delete others.
Selection.Sort Key1:=Range("A1"), Order1:=xlAscending, Header:=xlGuess, _
     OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom
Row = 1
Do Until Cells(Row, 3) = ""
    curtitle = LCase(Cells(Row, 3))
    i = Row + 1
    done = False
    Do Until done
        newtitle = LCase(Cells(i, 3))
        If newtitle = curtitle Then
            Rows(CStr(i) & ":" & CStr(i)).Select
            Selection.Delete Shift:=xlUp
        Else
            done = True
        End If
    Loop
    Row = Row + 1
Loop
End Sub

If you want to use the code, you can download the source code following the link at the
bottom of thsi page. After importing the code (frmImport.frm and module1.bas) to your VBA
project, click Tools->Macro->Macros to open "Macro" dialog, then select macro
"ExtractCites" and click Run button, a dialog will pop up. Enter the file name which
include your data, check the proper checkbox items, click "Import Citations" to run the
program. All the data will be extracted to the current active worksheet.
     Download source code

© 1998-, 董占山, 版权所有。
转载文章请注明出处(www.sunfinedata.com/articles)。