******************************************************************** * Description.......: DEDUFLAG.PRG - De-duper UDF for flagging potential duplicates * Calling Samples...: * Parameter List....: * Created by........: M. Asherman 3/4/99 * Modified by.......: Nadya Nosonovsky 10/05/2000 04:19:41 PM ******************************************************************** * 3/5/99: fill out logic. * 3/6/99: add WAIT NOWAIT message for clearer in-progress indicator. * Also add a final WAIT NOWAIT message giving summary count of potential dupes flagged. * 3/12/99: change return value to more useful numeric, instead of success flag. * Adjust logic to respect any current FILTER setting, so that only records in view can be flagged. * Suppress consideration of records that already have a non-empty DedupeFlag. * Introduce #include file deduincl.h, for cleaner localization of arbitrary constants. * * This function operates against the currently selected table, which is assumed * to be the Build Master table to be deduped. Any current FILTER is respected. function deduflag * takes 2 optional arguments and returns a numeric count of records flagged (< 0 for error cases). lparameters deduby_arg, ndays_arg && 2 optional arguments * deduby_arg: name of the method of matching to be used: * "Address", "Address+Days", or "MapRef". * If omitted or empty, use "Address" as the default. * ndays_arg: (applies only for "Address+Days" matching) * Specifies the maximum difference in transaction * dates for records to be considered possible dupes. * Defaults to DDNO_DAYS (currently = 7) if omitted. local deduby, dbfname, prevtalk, prevdel, prevselec, prevorder, prevpos, ; ndays, result * include macro definitions of deduper-related constants #include deduincl.h * argument processing deduby = iif(empty(m.deduby_arg), DDMM_SITUS, m.deduby_arg) && default Match Method * argument processing ndays = iif(empty(m.ndays_arg), DDNO_DAYS, m.ndays_arg) && default Match Method * preliminary error checking dbfname = dbf() && path/name or the bldmstr file to be deduped if empty(m.dbfname) && no current Build Master table selected wait window "DEDUFLAG: " + DDEM_NOBLD + "." return DDEC_NOBLD && failure endif if used(DDAL_QUERY) && reserved alias is already in use wait window "DEDUFLAG: " + DDEM_INUSE + "." return DDEC_INUSE && failure endif * preliminary inits prevtalk = set('talk') && save initial verbosity setting prevdel = set('deleted') && save initial deletion hiding state set deleted on && hide deleted records from SQL prevselec = select(0) && save initial Build Master selection state result = 0 && init. counter of potential dupes flagged * give an in-progress indication wait window nowait "Flagging potential dupes. Please wait..." set talk on && enable query/replace progress indications in status bar * invoke SQL query to identify candidate duplicates do case && dispatch according to the method of matching case m.deduby == DDMM_SITUS && match on Situs address fields * Firstly select duplicate values select upper(ccode+town+street+str(stnum,4)+stnumext+unit) as situs, ; count(*) as cntSitus; from (m.dbfname) ; group by 1 ; having cntSitus>1 ; order by 1 into cursor (DDAL_QUERY) * Then update the BldMstr table update (dbfname) set dedupeflag= DDFC_POSS ; where upper(ccode+town+street+str(stnum,4)+stnumext+unit) ; in (select situs from (DDAL_QUERY)) m.result=_tally && Number of updated records case m.deduby == DDMM_SITDA && match on Situs fields + date within range ** Create temproral cursor myCursor select upper(ccode+town+street+str(stnum,4)+stnumext+unit) as situs, recnum, date ; from (m.dbfname) ; where dedupeflag=DDFC_NONE ; into cursor myCursor order by recnum, situs, date ** Prepare final cursor by self-join select myCursor.* ; from myCursor inner join myCursor my on myCursor.recnum<>my.recnum; and myCursor.situs=my.situs ; and abs(myCursor.date - my.date) <= m.ndays ; group by myCursor.recnum ; order by 1 into cursor (DDAL_QUERY) * check for null query result case - really just a minor optimization, but what the hell if m.result >= 0 && no errors yet result = reccount(DDAL_QUERY) && count of records that satisfied the dupes query endif * update the DedupeFlag values for all candidate dupes, respecting any current FILTER setting if m.result > 0 && there are possible dupes to be flagged result = 0 && re-init counter, to get the real filtered count * establish RELATION from query into updatable cursor on Build Master table prevorder = order(m.prevselec) && save initial Build Master ordering state prevpos = possave(m.prevselec) && save initial Build Master record position set order to recnum in (m.prevselec) && switch to ordering for RELATION set relation to recnum into (m.prevselec) && join from query into updatable cursor * look over records in query result (normally this shouldn't be very large) scan for not eof(m.prevselec) && loop over the query result set (joined to updatable cursor) replace dedupeflag with DDFC_POSS in (m.prevselec) && update possible dupe flag result = m.result + 1 && count of possible dupes actually flagged in this call endscan * post-loop cleanups set order to (m.prevorder) in (m.prevselec) && restore initial index ordering posrest(m.prevpos, m.prevselec) && restore initial position endif if used ('MyCursor') use in myCursor endif case m.deduby == DDMM_MAPRF && match on MapRef field * Firstly select duplicate values select upper(ccode+town+Mapref) as ccttmap, ; count(*) as cntMapref; from (m.dbfname) ; group by 1 ; having cntMapref>1 ; order by 1 into cursor (DDAL_QUERY) * Then update the BldMstr table update (dbfname) set dedupeflag= DDFC_POSS ; where upper(ccode+town+Mapref); in (select ccttmap from (DDAL_QUERY)) m.result=_tally && Number of updated records otherwise && unknown method of matching records wait window "DEDUFLAG: " + DDEM_UNKME + ": " + m.deduby result = DDEC_UNKME && set up for failure result endcase * clean up and return set talk &prevtalk && restore initial verbosity if used(DDAL_QUERY) if m.result >= 0 wait window nowait "Flagged " + ltrim(str(m.result)) + " potential duplicates." else wait clear && clear any residual WAIT message endif use in (DDAL_QUERY) && close (and release) the temporary SQL query cursor else wait clear && clear any residual WAIT message endif select (m.prevselec) && restore initial selection set deleted &prevdel && restore initial deletion hiding state return m.result && return count of dupes flagged, or < 0 for abnormal error cases>Hi!