Resampling with Pandas |
The first counts all the rows in each 20-minute slot:
In [11]: df1.IP.resample('20min').count()  # '20min' reads better than the '20t' alias
Out[11]:
datetime
2013-05-30 06:00:00 3
2013-05-30 06:20:00 1
dtype: int64
The second grabs those rows between certain times:
In [12]: df1.IP.between_time('06:00:00', '06:20:00')
Out[12]:
datetime
2013-05-30 06:00:41 173.199.116.171
2013-05-30 06:05:41 61.245.172.14
2013-05-30 06:10:42 74.86.158.106
Name: IP, dtype: object
There may be a neater solution to the general problem (so you don't need to
spell out the start and end times) using a time-based grouper, but this is
the best I can do to print all of the groupings:
In [13]: tg = pd.Grouper(freq='20min')
In [14]: g = df1.groupby(tg)
In [15]: def f(x):
    ...:     print(x)
    ...:     return x

In [16]: g.apply(f)
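In current pandas (where `how=` and `TimeGrouper` are gone), the three steps above can be sketched like this; the log data is made up to mirror the transcript:

```python
import pandas as pd

# Made-up request log mirroring the output shown above.
idx = pd.to_datetime([
    "2013-05-30 06:00:41", "2013-05-30 06:05:41",
    "2013-05-30 06:10:42", "2013-05-30 06:20:01",
])
df1 = pd.DataFrame({"IP": ["173.199.116.171", "61.245.172.14",
                           "74.86.158.106", "203.0.113.9"]}, index=idx)

# 1. Count the rows in each 20-minute slot.
counts = df1["IP"].resample("20min").count()

# 2. Grab the rows between two wall-clock times (inclusive on both ends).
window = df1["IP"].between_time("06:00:00", "06:20:00")

# 3. Walk every 20-minute grouping without spelling out the times.
for slot, group in df1.groupby(pd.Grouper(freq="20min")):
    print(slot, len(group))
```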
|
resampling or interpolation? |
There are some relationships between interpolation and resampling.
Resampling implies changing the sample rate of a set of samples. In the
case of an image, these are the pixel values sampled at each pixel
coordinate in the image. In the case of audio, these are the amplitude
values sampled at each time point.
Resampling is used to either increase the sample rate (make the image
larger) or decrease it (make the image smaller).
Interpolation is the process of calculating values between sample points.
So, if you resample an image, you can use interpolation to do it. There are
many interpolation methods - nearest neighbor, linear, cubic, Lanczos,
etc. - each with its own quality/performance trade-off.
If you reduce the sampling rate, you can get aliasing. This is where you
are trying to represent frequency content that is too high for the new,
lower sample rate, so it folds back and shows up as spurious lower
frequencies (moiré patterns in images). Good downsampling therefore
low-pass filters the data first.
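A minimal NumPy sketch of the two ideas, with a made-up five-sample signal: linear interpolation places new values on straight lines between samples, while nearest-neighbor simply copies the closest original sample:

```python
import numpy as np

# Toy signal: 5 samples at integer coordinates.
x_old = np.arange(5)
y_old = np.array([0.0, 1.0, 0.0, -1.0, 0.0])

# Upsample: 9 new sample positions covering the same range.
x_new = np.linspace(0, 4, 9)

# Linear interpolation between neighboring samples.
y_lin = np.interp(x_new, x_old, y_old)

# Nearest-neighbor: copy the value of the closest original sample.
nearest_idx = np.round(x_new).astype(int)
y_nn = y_old[nearest_idx]
```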
|
Resampling Audio in MATLAB |
Yes, resample is your function. To downsample x from 44100 Hz to 22050 Hz:
y = resample(x,1,2);
(the "1" and "2" arguments define the resampling ratio: 22050/44100 = 1/2)
To upsample back to 44100 Hz:
x2 = resample(y,2,1);
Note that the resample function includes the necessary anti-aliasing
(lowpass) filter.
As you probably know, the "recovered" signal x2 has lost any frequency
content above 11025 Hz (the Nyquist frequency of the intermediate
22050 Hz signal) that may have been present in x.
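For readers outside MATLAB, SciPy's `resample_poly` plays the same role (polyphase resampling with a built-in anti-aliasing lowpass filter); a sketch with a made-up one-second 440 Hz tone:

```python
import numpy as np
from scipy.signal import resample_poly

fs = 44100
t = np.arange(fs) / fs                # one second of samples
x = np.sin(2 * np.pi * 440 * t)       # 440 Hz tone

# Downsample 44100 -> 22050 Hz (ratio 1/2); the anti-aliasing
# filter is applied internally, as with MATLAB's resample.
y = resample_poly(x, up=1, down=2)

# Upsample back to 44100 Hz; content above 11025 Hz is gone for good.
x2 = resample_poly(y, up=2, down=1)
```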
|
Resizing a 3D image (and resampling) |
From the docstring for scipy.ndimage.zoom:
"""
zoom : float or sequence, optional
The zoom factor along the axes. If a float, `zoom` is the same for each
axis. If a sequence, `zoom` should contain one value for each axis.
"""
What is the scale factor between the two images? Is it constant across all
axes (i.e. are you scaling isometrically)? In that case zoom should be a
single float value. Otherwise it should be a sequence of floats, one per
axis.
For example, if the physical dimensions of whole and flash can be assumed
to be equal, then you could do something like this:
import scipy.ndimage as nd  # the snippet assumes this alias

dsfactor = [w/float(f) for w, f in zip(whole.shape, flash.shape)]
downed = nd.zoom(flash, zoom=dsfactor)
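A self-contained sketch of that recipe with made-up volumes (note that `zoom` defaults to cubic spline interpolation, `order=3`):

```python
import numpy as np
import scipy.ndimage as nd

# Hypothetical volumes: 'whole' at full resolution, 'flash' half-size.
whole = np.zeros((40, 50, 60))
flash = np.random.rand(20, 25, 30)

# One zoom factor per axis, so anisotropic shapes are handled too.
dsfactor = [w / float(f) for w, f in zip(whole.shape, flash.shape)]
upsampled = nd.zoom(flash, zoom=dsfactor)  # shape now matches 'whole'
```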
|
Resampling Image using JMagick |
It appears there are no convenience methods for both -resample and -layers
options.
The only thing in JMagick's API docs that resembles any of those options is
the method sampleImage in class MagickImage. However, that operates only in
pixels. There is indeed a setUnits method that allows you to change the
units declared in the header of the image file. But that's about it. It
doesn't modify the image itself. And there seems to be no connection
between the sampleImage and setUnits methods.
There is some code out there to resample an image using "manual"
calculations. The following snippet is based on the one available here:
MagickImage lightImg = new MagickImage(new ImageInfo(strOrigPath));
// Get the original resolution
double origXRes = lightImg.getXResolution();
double origYRes = lightImg.getYResolution();
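The arithmetic behind such a manual resample is just scaling the pixel dimensions by the ratio of target to original resolution; a sketch with made-up numbers (the resulting size would then be handed to a pixel-based call such as sampleImage):

```python
# Made-up values standing in for what the image header would report.
orig_x_res, orig_y_res = 300.0, 300.0   # original DPI
target_res = 72.0                        # desired DPI
width, height = 3000, 2400               # original pixel dimensions

# New pixel dimensions: scale by target/original resolution per axis.
new_width = int(round(width * target_res / orig_x_res))
new_height = int(round(height * target_res / orig_y_res))
```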
|
R: Row resampling loop speed improvement |
I put very little thought into actually optimizing this; I was just
concentrating on doing something at least reasonable that matches your
procedure.
Your big problem is that you are growing objects via rbind and cbind.
Basically, any time you see someone create an object with data.frame() or
c() and then expand it using rbind, cbind or c, you can be very sure the
resulting code will be close to the slowest possible way of doing whatever
task is being attempted.
This version is around 12-13 times faster, and I'm sure you could squeeze
some more out of this if you put some real thought into it:
s_size <- 200
int <- 10
reps <- 30
ss <- rep(seq(1,s_size,by = int),each = reps)
id <- rep(seq_len(reps),times = s_size/int)
foo <- function(i,j,data){
res <- data[s
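The growing-objects trap is not specific to R; a Python/NumPy sketch of the same contrast (sizes made up), stacking one row at a time versus preallocating and filling in place:

```python
import numpy as np

n = 2000

# Anti-pattern: grow an array row by row. Every vstack copies the
# whole accumulated array, so the total work is quadratic - the
# Python analogue of rbind inside a loop.
grown = np.empty((0, 3))
for i in range(n):
    grown = np.vstack([grown, np.random.rand(1, 3)])

# Preallocate once, then fill in place: linear work, no copying.
pre = np.empty((n, 3))
for i in range(n):
    pre[i] = np.random.rand(3)
```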
|
Strange behavior of pandas resampling |
Perhaps you want interpolate instead of resample. Here's one way:
In [53]: index = pd.date_range(freq='66T', start=ts.first_valid_index(),
periods=5)
In [54]:
ts.reindex(ts.index.union(index)).interpolate('time').loc[index]
Out[54]:
2011-01-02 01:00:00 0.0
2011-01-02 02:06:00 1.1
2011-01-02 03:12:00 2.2
2011-01-02 04:18:00 3.3
2011-01-02 05:24:00 4.4
Freq: 66T, dtype: float64
In [55]: index = pd.date_range(freq='65T', start=ts.first_valid_index(),
periods=5)
In [56]:
ts.reindex(ts.index.union(index)).interpolate('time').loc[index]
Out[56]:
2011-01-02 01:00:00 0.000000
2011-01-02 02:05:00 1.083333
2011-01-02 03:10:00 2.166667
2011-01-02 04:15:00 3.250000
2011-01-02 05:20:00 4.333333
Freq: 65T, dtype: float64
That said,
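Pieced together, a self-contained version of the 65-minute case; the series definition is inferred from the output above (five values spaced 66 minutes apart):

```python
import numpy as np
import pandas as pd

# Series consistent with the transcript: 0.0, 1.1, ... every 66 minutes.
ts = pd.Series(np.arange(5) * 1.1,
               index=pd.date_range("2011-01-02 01:00",
                                   periods=5, freq="66min"))

# Resample-by-interpolation onto a 65-minute grid: union the two
# indexes, interpolate linearly in time, keep only the new grid points.
index = pd.date_range(freq="65min", start=ts.first_valid_index(), periods=5)
out = (ts.reindex(ts.index.union(index))
         .interpolate("time")
         .loc[index])
```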
|