Aligning stack variables in D

In D, you can align struct/class members by using the align keyword, e.g.:
struct Vec4 { align(16) float[4] elems; }
However, it appears that you can't do the same on the stack:
void foo()
{
    align(16) float[4] vec; // error: found 'align' instead of statement
}
Is there a way to align data on the stack? In particular, I want to create a 16-byte-aligned array of floats to load into XMM registers using movaps, which is significantly faster than movups.
e.g.
void foo()
{
    float[4] v = [1.0f, 2.0f, 3.0f, 4.0f];
    asm
    {
        movaps XMM0, v; // v must be 16-byte aligned for this to work.
        ...
    }
}

If you are willing to burn an extra 16 bytes, you can do the alignment yourself at run time. Aside from that, I wouldn't know.
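A minimal sketch of that run-time approach, written in C++ rather than D purely to show the pointer arithmetic: over-allocate by 16 bytes, round the start address up to the next multiple of 16, and build the vector there. The names Vec4, round_up_16 and sum_on_aligned_stack are hypothetical, and a D version would need the equivalent address arithmetic before the asm block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical 4-float vector, mirroring the struct in the question.
struct Vec4 { float elems[4]; };

// Rounds an address up to the next multiple of 16 (moves at most 15 bytes).
inline std::uintptr_t round_up_16(std::uintptr_t addr) {
    return (addr + 15u) & ~static_cast<std::uintptr_t>(15u);
}

// Builds a Vec4 at a 16-byte-aligned address inside an over-sized stack
// buffer and returns the sum of its elements.
inline float sum_on_aligned_stack() {
    unsigned char raw[sizeof(Vec4) + 16];  // the "extra 16 bytes"
    const auto start = reinterpret_cast<std::uintptr_t>(raw);
    const std::size_t offset = round_up_16(start) - start;  // 0..15
    Vec4* v = new (raw + offset) Vec4{{1.0f, 2.0f, 3.0f, 4.0f}};
    // This is the property an aligned load such as movaps depends on.
    assert(reinterpret_cast<std::uintptr_t>(v->elems) % 16 == 0);
    return v->elems[0] + v->elems[1] + v->elems[2] + v->elems[3];
}
```

Calling sum_on_aligned_stack() exercises the whole path; the internal assert checks that the elements start on a 16-byte boundary, and no more than 15 of the spare bytes are ever skipped.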

Related

Creating safe overlapping/union fields in structs

In C++, I can create structures like these:
union Vector4
{
    struct { float x, y, z, w; };
    float data[4];
};
so I can easily access the data as fields or as a contiguous array. Alternatively, I can just create a pointer to the first field x and read from the pointer as a contiguous array.
I know that there are enums, but I can't pay for the additional overhead. I also know I can create unions in Rust, but they require me to litter my code with unsafe wherever I access them. I feel I shouldn't have to, since the code is not unsafe: the underlying data is always represented as floats (and I need the C layout, #[repr(C)], so the compiler won't reorder the fields).
How would I implement this in Rust so that I can access the fields by name but also have easy and safe access to the whole struct's contiguous memory? If this is not possible, is there a way so I can safely take a slice of a struct?
There is no such thing as a safe union. Personally, I would argue that transmuting between fixed-size arrays of integer types should be considered safe, but at the moment there are no exceptions.
That being said, here is my totally 100% not a union Vector4. As you can see, Deref works to hide the unsafe code and makes it so you can treat Vector4 as either a struct or an array, depending on the context it is used in. The transmute also isn't ideal, but I feel like I can justify it in this case. If you choose to do something like this, you may want to implement DerefMut as well.
use std::ops::Deref;

// repr(C) guarantees the fields are laid out in declaration order,
// which the transmute below relies on.
#[repr(C)]
pub struct Vector4<T> {
    pub x: T,
    pub y: T,
    pub z: T,
    pub w: T,
}

impl<T> Deref for Vector4<T>
where
    T: Copy + Sized,
{
    type Target = [T; 4];

    fn deref(&self) -> &Self::Target {
        use std::mem::transmute;
        unsafe { transmute(self) }
    }
}

pub fn main() {
    let a = Vector4 {
        x: 37,
        y: 21,
        z: 83,
        w: 94,
    };
    println!("{:?}", &a[..]);
    // Output: [37, 21, 83, 94]
}

How to get geometric properties of bodies in Drake?

I have a .urdf file specifying a group of obstacles, that I add to my simulation like this:
drake::systems::DiagramBuilder<double> builder;
auto [plant, scene_graph] =
    drake::multibody::AddMultibodyPlantSceneGraph(&builder, 0.0);
drake::multibody::Parser parser(&plant, &scene_graph);
auto obstacles = parser.AddModelFromFile("obstacles.urdf");
plant.WeldFrames(
    plant.world_frame(), plant.GetFrameByName("ground", obstacles));
All the obstacles are fixed to the "ground" object in the .urdf file by fixed joints, and to stop everything from falling down I welded the ground to the world frame.
All the obstacles are boxes, and I need to extract the coordinates of the vertices of the boxes. My plan was to get the width, depth and height properties of each of the Box shapes, in addition to their origins, and from this calculate the vertices (assuming none of the boxes are rotated). So far I have tried using plant.GetBodiesWeldedTo() and then plant.GetCollisionGeometriesForBody(), but I have not been able to extract the properties that I need.
How would one go about getting the vertices (or the position of the origin and width, depth and height) of the objects, after importing them into Drake?
Introspecting geometry is not straightforward. It requires "shape reification". Essentially, you have access to Shape but to find out what specific type of Shape it is, you run it through a ShapeReifier. An example of that can be found in ShapeName which takes a generic shape and returns its name. What you want is something to extract box dimensions.
To extract the pose, you'll also need access to a QueryObject from SceneGraph. To do that, you'll need to evaluate SceneGraph's output port with an appropriate Context. I'm not going to focus on acquiring the QueryObject (we can open up a new Stack Overflow question on that if necessary).
It's a multi-stage thing. We'll assume you started with MultibodyPlant::GetCollisionGeometriesForBody...
class BoxExtractor : public ShapeReifier {
 public:
  explicit BoxExtractor(const Shape& shape) { shape.Reify(this); }

  std::optional<Box> box() const { return box_; }

 private:
  // Only the Box overload records anything; every other shape is ignored.
  void ImplementGeometry(const Box& box, void*) override { box_ = box; }
  void ImplementGeometry(const Capsule&, void*) override {}
  void ImplementGeometry(const Cylinder&, void*) override {}
  void ImplementGeometry(const Convex&, void*) override {}
  void ImplementGeometry(const Ellipsoid&, void*) override {}
  void ImplementGeometry(const HalfSpace&, void*) override {}
  void ImplementGeometry(const Mesh&, void*) override {}
  void ImplementGeometry(const Sphere&, void*) override {}

  std::optional<Box> box_{};
};

const QueryObject<double>& query_object = ....; // got it somehow.
const auto& inspector = scene_graph.model_inspector();
const auto& collision_geometries = plant.GetCollisionGeometriesForBody(some_body);
for (const GeometryId id : collision_geometries) {
  std::optional<Box> box = BoxExtractor(inspector.GetShape(id)).box();
  if (box) {
    const RigidTransformd& X_WB = query_object.X_WG(id);
    // Box axes and box origin, all expressed in the world frame.
    const Vector3d v_WBx = X_WB.rotation().col(0);
    const Vector3d v_WBy = X_WB.rotation().col(1);
    const Vector3d v_WBz = X_WB.rotation().col(2);
    const Vector3d& p_WBo = X_WB.translation();
    std::vector<Vector3d> corners;
    // Each corner lies half a dimension away from the box origin along each axis.
    const Vector3d half_size{box->width() / 2, box->depth() / 2, box->height() / 2};
    for (const double x_sign : {-1., 1.}) {
      for (const double y_sign : {-1., 1.}) {
        for (const double z_sign : {-1., 1.}) {
          corners.emplace_back(p_WBo +
                               x_sign * half_size(0) * v_WBx +
                               y_sign * half_size(1) * v_WBy +
                               z_sign * half_size(2) * v_WBz);
        }
      }
    }
    // Do stuff with the eight corners.
  }
}
(I'm about 98% sure the code will work as is... I typed it on the fly and can't guarantee it'll just copy and paste.)
Edit --
I realized I only answered half of your question: how to get the box dimensions. You still want to know where the box is so you can compute the box vertices. I've modified the example code to do so.
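The corner arithmetic can also be checked in isolation from Drake. The sketch below uses plain arrays; Vec3 and BoxCorners are hypothetical names, the axes argument stands in for the three columns of the rotation matrix, and size holds the full width, depth and height.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

using Vec3 = std::array<double, 3>;

// Returns the eight world-frame corners of a box.
//   center: box origin expressed in world.
//   axes:   axes[i] is the i-th box axis (unit vector) expressed in world.
//   size:   full width, depth and height of the box.
inline std::vector<Vec3> BoxCorners(const Vec3& center,
                                    const std::array<Vec3, 3>& axes,
                                    const Vec3& size) {
    const Vec3 half{{size[0] / 2, size[1] / 2, size[2] / 2}};
    std::vector<Vec3> corners;
    corners.reserve(8);
    for (const double sx : {-1.0, 1.0}) {
        for (const double sy : {-1.0, 1.0}) {
            for (const double sz : {-1.0, 1.0}) {
                Vec3 corner{};
                for (std::size_t k = 0; k < 3; ++k) {
                    // Origin plus a signed half-extent along each box axis.
                    corner[k] = center[k] + sx * half[0] * axes[0][k] +
                                sy * half[1] * axes[1][k] +
                                sz * half[2] * axes[2][k];
                }
                corners.push_back(corner);
            }
        }
    }
    assert(corners.size() == 8);
    return corners;
}
```

For an unrotated 2 x 4 x 6 box centred at (1, 2, 3), the first corner returned is the minimum corner (0, 0, 0) and the last is the maximum corner (2, 4, 6).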
I know Python isn't the language for this question, but I would still like to complement Sean's answer with the Python API ;)
Here's an example of doing SceneGraph-only shape introspection:
Unofficial example (for a basic Rviz usage): drake_ros1_hacks/_ros_geometry.py - includes Python's version of the reifier (if type(shape) == ...)
Along the lines of exercising the mapping between SceneGraph and MultibodyPlant in Python:
Drake unittest: plant_test.py, MBP <-> SG queries
Drake tutorials: MultibodyPlant rendering, making rendering labels

16 bit logic/computer simulation in Swift

I’m trying to make a basic simulation of a 16 bit computer with Swift. The computer will feature
An ALU
2 registers
That’s all. I have enough knowledge to create these parts visually and understand how they work, but it has become increasingly difficult to make larger components with more inputs while using my current approach.
My current approach has been to wrap each component in a struct. This worked early on, but it is becoming increasingly difficult to manage multiple inputs while staying true to the principles of computer science.
The primary issue is that the components aren’t updating with the clock signal. I have the output of the component updating when get is called on the output variable, c. This, however, neglects the idea of a clock signal and will likely cause further problems later on.
It’s also difficult to make getters and setters for each variable without getting errors about mutability. Although I have worked through these errors, they are annoying and slow down the development process.
The last big issue is updating the output. The output doesn’t update when the inputs change; it updates when told to do so. This isn’t accurate to the qualities of real computers and is a fundamental error.
This is an example. It is the ALU I mentioned earlier. It takes two 16 bit inputs and outputs 16 bits. It has two unary ALUs, which can make a 16 bit number zero, negate it, or both. Lastly, it either adds the inputs or does a bitwise AND, depending on the f flag, and inverts the output if the no flag is set.
struct ALU {
    //Operations are done in the order listed. For example, if zx and nx are 1, it first makes input 1 zero and then inverts it.
    var x : [Int] //Input 1
    var y : [Int] //Input 2
    var zx : Int //Make input 1 zero
    var zy : Int //Make input 2 zero
    var nx : Int //Invert input 1
    var ny : Int //Invert input 2
    var f : Int //If 0, do a bitwise AND operation. If 1, add the inputs
    var no : Int //Invert the output
    public var c : [Int] { //Output
        get {
            //Numbers first go through unary ALUs. These can negate the input (and output the value), return 0, or return the inverse of 0. They then undergo the operation specified by f, either addition or a bitwise AND, and are negated if no is 1.
            var ux = UnaryALU(z: zx, n: nx, x: x).c //Unary ALU. See comments for more
            var uy = UnaryALU(z: zy, n: ny, x: y).c
            var fd = select16(s: f, d1: Add16(a: ux, b: uy).c, d0: and16(a: ux, b: uy).c).c //Adds a 16 bit number or does a bitwise AND operation. For more on select16, see the line below.
            var out = select16(s: no, d1: not16(a: fd).c, d0: fd).c //Selects a number. If s is 1, it returns d1. If s is 0, it returns d0. d0 is the value returned by fd, while d1 is the inverse.
            return out
        }
    }
    public init(x:[Int],y:[Int],zx:Int,zy:Int,nx:Int,ny:Int,f:Int,no:Int) {
        self.x = x
        self.y = y
        self.zx = zx
        self.zy = zy
        self.nx = nx
        self.ny = ny
        self.f = f
        self.no = no
    }
}
I use c for the output variable, store values with multiple bits in Int arrays, and store single bits in Int values.
I’m doing this on Swift Playgrounds 3.0 with Swift 5.0 on a 6th generation iPad. I’m storing each component or set of components in a separate file in a module, which is why some variables and all structs are marked public. I would greatly appreciate any help. Thanks in advance.
So, I’ve completely redone my approach and have found a way to bypass the issues I was facing. What I’ve done is make what I call “tracker variables” for each input. When get is called for a variable, it returns the value of the tracker assigned to it. When set is called, it updates the value of the tracker and calls an update() function that updates the output of the circuit. This essentially creates a ‘copy’ of each variable. I did this to prevent any infinite loops.
Trackers are unfortunately necessary here. I’ll demonstrate why:
var variable : Type {
    get {
        return variable //Calls the getter again, resulting in an infinite loop
    }
    set {
        //Do something
    }
}
In order to make a setter, Swift requires a getter to be made as well. In this example, calling variable simply calls get again, resulting in a never-ending cascade of calls to get. Tracker variables are a workaround that use minimal extra code.
Using an update method makes sure the output responds to a change in any input. This also works with a clock signal, due to the architecture of the components themselves. Although it appears to act as the clock, it does not.
For example, in data flip-flops, the clock signal is passed into gates. All a clock signal does is deactivate a component when the signal is off. So, I can implement that within update() while remaining faithful to reality.
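To make the clock-gating idea concrete, here is a rough sketch of a level-triggered D latch, written in C++ rather than Swift only to keep the example compact. GatedLatch and its member names are hypothetical; the point is that update() runs on every input change but only lets the data through while the clock is high.

```cpp
#include <cassert>

// Hypothetical gated D latch: the output follows the data input while the
// clock is high and holds its last value while the clock is low.
class GatedLatch {
 public:
    void set_data(bool d) { data_ = d; update(); }
    void set_clock(bool c) { clock_ = c; update(); }
    bool output() const { return q_; }

 private:
    // Called after every input change; the clock decides whether it matters.
    void update() {
        if (clock_) { q_ = data_; }
    }

    bool data_ = false;
    bool clock_ = false;
    bool q_ = false;
};

// Walks through one store-and-hold cycle and returns the held output.
inline bool LatchHoldsWhileClockLow() {
    GatedLatch latch;
    latch.set_clock(true);
    latch.set_data(true);    // clock high: output follows data
    assert(latch.output());
    latch.set_clock(false);
    latch.set_data(false);   // clock low: the change is ignored
    return latch.output();   // still the stored value
}
```

Every setter still triggers update(), so the output reacts to input changes, yet the stored bit only changes while the clock allows it.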
Here’s an example of a half adder. Note that the tracker variables I mentioned are marked by an underscore in front of their name. It has two inputs, x and y, which are 1 bit each. It also has two outputs, high and low, also known as carry and sum. The outputs are also one bit.
struct halfAdder {
    private var _x : Bool //Tracker for x
    public var x: Bool { //Input 1
        get {
            return _x //Return the tracker’s value
        }
        set {
            _x = newValue //Set the tracker to the new value (reading x here would just return the old _x)
            update() //Update the output
        }
    }
    private var _y : Bool //Tracker for y
    public var y: Bool { //Input 2
        get {
            return _y
        }
        set {
            _y = newValue
            update()
        }
    }
    public var high : Bool //High output, or ‘carry’
    public var low : Bool //Low output, or ‘sum’
    internal mutating func update(){ //Updates the output
        high = x && y //AND gate, sets the high output
        low = (x || y) && !(x && y) //XOR gate, sets the low output
    }
    public init(x:Bool, y:Bool){ //Initializer
        self.high = false //This will change when the variables are set, ensuring a correct output.
        self.low = false //See above
        self._x = x //Setting trackers and variables
        self._y = y
        self.x = x
        self.y = y
    }
}
This is a very clean way, save for the trackers, to accomplish this task. It can trivially be expanded to fit any number of bits by using arrays of Bool instead of a single value. It respects the clock signal, updates the output when the inputs change, and is very similar to real computers.

What is UnsafeMutablePointer<Void>? How to modify the underlying memory?

I am trying to work with SpriteKit's SKMutableTexture class but I don't know how to work with UnsafeMutablePointer<Void>. I have a vague idea that it is a pointer to a succession of byte data in memory. But how can I update it? What would this actually look like in code?
Edit
Here is a basic code sample to work with. How would I get this to do something as simple as create a red square on the screen?
let tex = SKMutableTexture(size: CGSize(width: 10, height: 10))
tex.modifyPixelDataWithBlock { (ptr:UnsafeMutablePointer<Void>, n:UInt) -> Void in
/* ??? */
}
From the docs for SKMutableTexture.modifyPixelDataWithBlock:
The texture bytes are assumed to be stored as tightly packed 32 bpp, 8bpc (unsigned integer) RGBA pixel data. The color components you provide should have already been multiplied by the alpha value.
So, while you’re given a void*, the underlying data is in the form of a stream of 4x8 bits.
You could manipulate such a structure like so:
// struct of 4 bytes
struct RGBA {
    var r: UInt8
    var g: UInt8
    var b: UInt8
    var a: UInt8
}
let tex = SKMutableTexture(size: CGSize(width: 10, height: 10))
tex.modifyPixelDataWithBlock { voidptr, len in
    // convert the void pointer into a pointer to your struct
    let rgbaptr = UnsafeMutablePointer<RGBA>(voidptr)
    // next, create a collection-like structure from that pointer
    // (this second part isn’t necessary but can be nicer to work with)
    // note the length you supply to create the buffer is the number of
    // RGBA structs, so you need to convert the supplied length accordingly...
    let pixels = UnsafeMutableBufferPointer(start: rgbaptr, count: Int(len) / sizeof(RGBA))
    // now, you can manipulate the pixels buffer like any other mutable collection type
    for i in indices(pixels) {
        pixels[i].r = 0x00
        pixels[i].g = 0xff
        pixels[i].b = 0x00
        pixels[i].a = 0x20
    }
}
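The docs quoted above also ask for colour components that are already multiplied by alpha. As a rough illustration of that byte layout, independent of SpriteKit and written in C++, the sketch below premultiplies a colour and writes it into a tightly packed RGBA buffer. RGBA, Premultiply and FillPixels are hypothetical names, and rounding to the nearest integer is an assumption rather than anything the docs specify.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One pixel: four 8-bit channels, tightly packed.
struct RGBA { std::uint8_t r, g, b, a; };
static_assert(sizeof(RGBA) == 4, "one pixel must be exactly 4 bytes");

// Scales each colour channel by alpha/255, rounding to nearest (assumed).
inline RGBA Premultiply(std::uint8_t r, std::uint8_t g, std::uint8_t b,
                        std::uint8_t a) {
    auto scale = [a](std::uint8_t c) {
        return static_cast<std::uint8_t>((c * a + 127) / 255);
    };
    return RGBA{scale(r), scale(g), scale(b), a};
}

// Writes the same pixel into every 4-byte slot of a raw texture buffer.
inline void FillPixels(std::vector<std::uint8_t>& bytes, RGBA color) {
    assert(bytes.size() % sizeof(RGBA) == 0);
    for (std::size_t i = 0; i + 3 < bytes.size(); i += 4) {
        bytes[i] = color.r;
        bytes[i + 1] = color.g;
        bytes[i + 2] = color.b;
        bytes[i + 3] = color.a;
    }
}
```

A 10 x 10 texture is a 400-byte buffer; filling it with Premultiply(255, 0, 0, 255) gives the opaque red square the question asks for, while a half-transparent red would be stored with its red channel already scaled down.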
UnsafeMutablePointer<Void> is the Swift equivalent of void* - a pointer to anything at all. You can access the underlying memory as its memory property. Typically, if you know what the underlying type is, you'll coerce to a pointer to that type first. You can then use subscripting to reach a particular "slot" in memory.
For example, if the data is really a sequence of UInt8 values, you could say:
let buffer = UnsafeMutablePointer<UInt8>(ptr)
You can now access the individual UInt8 values as buffer[0], buffer[1], and so forth.

HLSL float array packing in constant buffer?

I have a problem passing a float array to a vertex shader (HLSL) through a constant buffer. I know that each "float" in the array below gets a 16-byte slot all by itself (space equivalent to a float4) due to the HLSL packing rules:
// C++ struct
struct ForegroundConstants
{
    DirectX::XMMATRIX transform;
    float bounceCpp[64];
};
// Vertex shader constant buffer
cbuffer ForegroundConstantBuffer : register(b0)
{
    matrix transform;
    float bounceHlsl[64];
};
(Unfortunately, the simple solution here does not work; nothing is drawn after I made that change.)
While the C++ data gets passed, the packing rule spaces the elements out so that each "float" from the bounceCpp C++ array is expected in a 16-byte slot of its own in the bounceHlsl array. This resulted in a warning similar to the following:
ID3D11DeviceContext::DrawIndexed: The size of the Constant Buffer at slot 0 of the Vertex Shader unit is too small (320 bytes provided, 1088 bytes, at least, expected). This is OK, as out-of-bounds reads are defined to return 0. It is also possible the developer knows the missing data will not be used anyway. This is only a problem if the developer actually intended to bind a sufficiently large Constant Buffer for what the shader expects.
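The two sizes in that warning can be reproduced with a little arithmetic. The sketch below is a check under stated assumptions, not D3D code: Matrix4x4 is a hypothetical 64-byte stand-in for DirectX::XMMATRIX, every element of an HLSL float array is assumed to start a new 16-byte register with the last one occupying only 4 bytes, and the total buffer size is assumed to be rounded up to a multiple of 16.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for DirectX::XMMATRIX: 4x4 floats, 64 bytes.
struct Matrix4x4 { float m[16]; };

// C++ side: the floats are contiguous, 4 bytes each.
struct ForegroundConstants {
    Matrix4x4 transform;
    float bounceCpp[64];
};

// HLSL side (assumed rule): each array element starts a new 16-byte
// register, and the final element only occupies its own 4 bytes.
constexpr std::size_t HlslFloatArrayBytes(std::size_t count) {
    return count == 0 ? 0 : (count - 1) * 16 + 4;
}

// Constant buffer sizes are rounded up to a multiple of 16 bytes (assumed).
constexpr std::size_t RoundUp16(std::size_t n) { return (n + 15) / 16 * 16; }

constexpr std::size_t kCppBytes = sizeof(ForegroundConstants);
constexpr std::size_t kHlslBytes = RoundUp16(64 + HlslFloatArrayBytes(64));

static_assert(kCppBytes == 320, "64-byte matrix + 64 * 4 bytes");
static_assert(kHlslBytes == 1088, "64 + (63 * 16 + 4) = 1076, rounded up");
```

Under those assumptions the C++ struct supplies 320 bytes and the shader expects 1088, which matches the numbers in the warning.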
The recommendation, as pointed out here and here, is to rewrite the HLSL constant buffer this way:
cbuffer ForegroundConstantBuffer : register(b0)
{
    matrix transform;
    float4 bounceHlsl[16]; // equivalent to 64 floats.
};
static float temp[64] = (float[64]) bounceHlsl;
main(pos : POSITION) : SV_POSITION
{
    int index = someValueRangeFrom0to63;
    float y = temp[index];
    // Bla bla bla...
}
But that didn't work (i.e. ID3D11Device1::CreateVertexShader never returns). I'm compiling things against Shader Model 4 Level 9_1, can you spot anything that I have done wrong here?
Thanks in advance! :)
Regards,
Ben
One solution, albeit non-optimal, is to just declare your float array as
float4 bounceHlsl[16];
then process the index like
float x = ((float[4])(bounceHlsl[i/4]))[i%4];
where i is the index you require.
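The index mapping can be sanity-checked on the CPU side. This C++ sketch (not HLSL) copies 64 contiguous floats into 16 four-float registers and reads them back with the same i/4 and i%4 split; Float4, BounceAt and RoundTrip are hypothetical names used only for the check.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical CPU-side mirror of one float4 register.
struct Float4 { float c[4]; };
static_assert(sizeof(Float4) == 16, "Float4 must match one 16-byte register");

// Same split as the shader expression: register i/4, component i%4.
inline float BounceAt(const Float4 (&packed)[16], int i) {
    assert(i >= 0 && i < 64);
    return packed[i / 4].c[i % 4];
}

// Fills a flat float[64] with its own indices, copies it byte-for-byte into
// 16 Float4 registers, and reads element i back through BounceAt.
inline float RoundTrip(int i) {
    float flat[64];
    for (int k = 0; k < 64; ++k) {
        flat[k] = static_cast<float>(k);
    }
    Float4 packed[16];
    static_assert(sizeof(packed) == sizeof(flat), "same byte count");
    std::memcpy(packed, flat, sizeof(flat));
    return BounceAt(packed, i);
}
```

Because the flat array and the register array have the same byte layout, the C++ struct can keep float bounceCpp[64] unchanged while the shader indexes the float4 array.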
