SQL breakdown of data - Level Extreme

Level Extreme platform

Subscription

Corporate profile

Products & Services

Support

Legal

Français

SQL breakdown of data

Message

From

24/05/2017 19:35:34

Steve Elster
San Diego, California, United States

24/05/2017 07:49:25

Tamar Granor
Tomorrow's Solutions, Llc
Elkins Park, Pennsylvania, United States

General information

Forum:

Visual FoxPro

Category:

Coding, syntax & commands

Title:

Re: SQL breakdown of data

Miscellaneous

Thread ID:

01648857

Message ID:

01651441

Views:

>>>>>>>>>>Am I supposed to start the program with the first line, because I get an error when I do:
>>>>>>>>>>
>>>>>>>>>>;with counts as (select Id, count(Id) as frequency
>>>>>>>>>
>>>>>>>>>I posted you code to try in SQL Server Management Studio if you're using SQL Server.
>>>>>>>>>
>>>>>>>>>What is your back-end?
>>>>>>>>
>>>>>>>>I do a lot of work in VFP. Can you suggest a statistics package that I can use within VFP? I've never used one. Thanks, Steve
>>>>>>>
>>>>>>>I don't know of any. Someone else made that suggestion, hopefully he'll clarify.
>>>>>>
>>>>>>I have clarified this question, I now have some code in VFP that does at least part of the job, but would still like to know if this can be done solely with SQL, and in VFP? For I suspect that an SQL solution will be easier to expand to include more "levels." I start the question with a reference to "two levels."
>>>>>>
>>>>>>Consider a table with 2 columns, both numeric: The Qty column contains orange juice package sizes in gallons. The mprices columns shows the cost paid for each unit of gallon. My question: What quantity (in # of gallons) are people most often buying and what is most common price that is being paid. Both cases require dividing the numbers into ranges, and then counting what is happening for each. The result is sort of two histograms with bins: one for qty being purchased, the other for range of prices paid. The kind of report I am looking for is produced by the below code in: myHist.dbf. Apparently I cannot upload prg for dbf files so the code follows:
>>>>>
>>>>>So you want to know which Qty appears most often and, for each Qty, which price is most common. Correct. Here's a stab at it.
>>>>>
>>>>>

>>>>>SELECT Qty, COUNT(*) AS QtySold ;
>>>>>   FROM YourTable ;
>>>>>   GROUP BY Qty ;
>>>>>   INTO CURSOR csrQtys
>>>>>
>>>>>SELECT Qty, MAX(QtySold) AS SoldMost ;
>>>>>   FROM csrQtys ;
>>>>>   INTO CURSOR csrSoldMost
>>>>>
>>>>>SELECT Qty, MPrices, COUNT(*) AS PriceCount ;
>>>>>   FROM YourTable ;
>>>>>   GROUP BY Qty, MPrices ;
>>>>>   INTO CURSOR csrPriceCounts
>>>>>
>>>>>SELECT Qty, MPrices, MAX(PriceCount) AS PriceMost ;
>>>>>   FROM csrPriceCounts ;
>>>>>   GROUP BY Qty, MPrices ;
>>>>>   INTO CURSOR csrPriceMost
>>>>>

>>>>>
>>>>>We could cut that down to two queries total by making the first query in each pair into a derived table in the second. That is, we could put the first query (without the INTO clause) into the FROM of the second.
>>>>>
>>>>>Hope I understood what you were looking for and that this helps.
>>>>>
>>>>>Tamar
>>>>
>>>>Thank you! The problem remains, however, that because no prices or Qty's repeat, the counts are always 1. For, the above does not create a range of Qty's and prices. So the question is unanswered: How many times do people select, a Qty between, for instance, between .25 and 1.00 gallons and what ranges of prices do they pay for a certain Qty of juice, and how often do select a price that lies within that range of prices? In order words, one has to specify the number of bins for each field, count the occurrences within each, and then filter them to see which is the most "popular." Steve
>>>>
>>>>Examples of it being done:
>>>>1. using a function such as round() but then the values better lie within a certain range? http://blog.shlomoid.com/2011/08/how-to-quickly-create-histogram-in.html
>>>>2. https://gist.github.com/wolever/9164392
>>>>3. Or, automate excel?
>>>>
>>>>I changed the field name of mPrice to myPrices to make things easier to read. The same change would have to be made in range1.prg, (code posted above.)
>>>>
>>>>

>>>>SELECT Qty, COUNT(*) AS QtySold ;
>>>>   FROM myPrices ;
>>>>   GROUP BY Qty ;
>>>>   INTO CURSOR csrQtys
>>>>
>>>>brow
>>>>
>>>>SELECT Qty, MAX(QtySold) AS SoldMost ;
>>>>   FROM csrQtys ;
>>>>   GROUP BY Qty, QtySold ;
>>>>   INTO CURSOR csrSoldMost
>>>>
>>>>brow
>>>>
>>>>SELECT Qty, myPrices, COUNT(*) AS PriceCount ;
>>>>   FROM myPrices ;
>>>>   GROUP BY Qty, myPrices ;
>>>>   INTO CURSOR csrPriceCounts
>>>>
>>>>BROWSE
>>>>
>>>>SELECT Qty, myPrices, MAX(PriceCount) AS PriceMost ;
>>>>   FROM csrPriceCounts ;
>>>>   GROUP BY Qty, myPrices ;
>>>>   INTO CURSOR csrPriceMost
>>>>  
>>>>browse
>>>>

>>>
>>>Just went back and read the thread. Seems to me you could create a cursor with the various ranges you want to test and then group the data into those ranges and do a count. Something like this:
>>>
>>>

>>>CREATE CURSOR QtyRange (iID I, nLow N(6,2), nHigh N(6,2)
>>>INSERT INTO QtyRange VALUES (1, .25, 1)
>>>INSERT INTO QtyRange VALUES (2, 1.25,2)
>>>INSERT INTO QtyRange VALUES (3, .25, .5)
>>>INSERT INTO QtyRange VALUES (4, .51, 1)
>>>* etc., to try different sets
>>>
>>>SELECT QtyRange.iID, COUNT(*) AS RangeCount;
>>>   FROM YourTable ;
>>>      JOIN QtyRange ;
>>>          ON YourTable.Qty BETWEEN QtyRange.nLow and QtyRange.nHigh ;
>>>   GROUP BY QtyRange.iID ;
>>>   INTO CURSOR csrRangeCount
>>>

>>>
>>>And then go on from there.
>>>
>>>Tamar
>>
>>Thanks! The following gets counts for the ranges in the "first" bin for a histogram, in this case for the qty column, using the ranges in nLow through nHigh. And idea as to how to get a count for the 2nd column, namely for a range of prices? Steve
>>
>>

>>CREATE CURSOR QtyRange (iID I, nLow N(6,2), nHigh N(6,2))
>>mID = 1
>>mMin =  .1
>>mMax =  1.0
>>mDiff = mMax - mMin
>>mBins = 10
>>mIncre = mDiff/mBins
>>mStep = mIncre
>>mLow = mMin
>>FOR r = .1 TO mBins mStep	
>>	INSERT INTO QtyRange VALUES (mId, mLow, mLow + mStep)
>>	mID = mID + 1
>>	mLow = mLow + mStep
>>endfor
>>
>>SELECT qtyRange.iID, qtyRange.nLow, qtyRange.nHigh, COUNT(*) AS RangeCount;
>>   FROM qtyRange ;
>>      JOIN myPrices ;
>>          ON myPrices.Qty BETWEEN QtyRange.nLow and QtyRange.nHigh ;
>>   GROUP BY QtyRange.iID, qtyRange.nLow, qtyRange.nHigh ;
>>   INTO CURSOR csrRangeCount
>>

>
>I think you want prices by Qty, right? So use your existing data to create the relevant ranges. You can find out the current min and max for a Qty value like this:
>
>

>SELECT Qty, MIN(MyPrice) AS MinPrice, MAX(MyPrice) AS MaxPrice;
>    FROM MyPrices ;
>    GROUP BY Qty
>

>
>and then divide that range up. If you want to look at unit price rather than total price (or maybe you're already storing a unit price?), do the math in the query, and you probably don't need the Qty in there.
>
>Once you have a range, you can do the same kind of thing as for the Qty values.
>
>Tamar

Thank you, Tamar. Your suggestion to use "between" is a real eye opener for creating ranges for bins. I think it might be helpful if I create the kind of preliminary report that I am hoping to get, using SQL as versus hard-coding it with VFP, as I did above with range1 and 2. Here it is: P.S. I could not create a table---here---with columns that line up. So I have attached a snapshot that illustrates the report. It appears at the top of this message, on my screen.
Thanks for you help, Steve

Map

View

Click here to load this message in the networking platform